A. predicted cDNA sequence of AtFtn2 (SEQ ID NO:1)
(synonym: At5g42480; synonym: ARC6) gene

Sequence length = 2406 nt

Start codon (ATG) is at position 1-3 

Stop codon (TAA) is at position 2404-2406 

1 ATGGAAGCTC TGAGTCACGT CGGCATTGGT CTCTCCCCAT TCCAATTATG CCGATTACCA 
61 CCGGCGACGA CAAAGCTCCG ACGTAGCCAC AACACCTCTA CAACTATCTG CTCCGCCAGC 
121 AAATGGGCCG ACCGTCTTCT CTCCGACTTC AATTTCACCT CCGATTCCTC CTCCTCCTCC
181 TTCGCCACCG CCACCACCAC CGCCACTCTC GTCTCTCTGC CACCATCTAT TGATCGTCCC 
241 GAACGCCACG TCCCCATCCC CATTGATTTC TACCAGGTAT TAGGAGCTCA AACACATTTC 
301 TTAACCGATG GAATCAGAAG AGCATTCGAA GCTAGGGTTT CGAAACCGCC GCAATTCGGT
361 TTCAGCGACG ACGCTTTAAT CAGCCGGAGA CAGATTCTTC AAGCTGCTTG CGAAACTCTG
421 TCTAATCCTC GGTCTAGAAG AGAGTACAAT GAAGGTCTTC TTGATGATGA AGAAGCTACA
481 GTCATCACTG ATGTTCCTTG GGATAAGGTT CCTGGGGCTC TCTGTGTATT GCAAGAAGGT
541 GGTGAGACTG AGATAGTTCT TCGGGTTGGT GAGGCTCTGC TTAAGGAGAG GTTGCCTAAG 
601 TCGTTTAAGC AAGATGTGGT TTTAGTTATG GCGCTTGCGT TTCTCGATGT CTCGAGGGAT 
661 GCTATGGCAT TGGATCCACC TGATTTTATT ACTGGTTATG AGTTTGTTGA GGAAGCTTTG 
721 AAGCTTTTAC AGGAGGAAGG AGCAAGTAGC CTTGCACCGG ATTTACGTGC ACAAATTGAT 
781 GAGACTTTGG AAGAGATCAC TCCGCGTTAT GTCTTGGAGC TACTTGGCTT ACCGCTTGGT 
841 GATGATTACG CTGCGAAAAG ACTAAATGGT TTAAGCGGTG TGCGGAATAT TTTGTGGTCT 
901 GTTGGAGGAG GTGGAGCATC AGCTCTTGTT GGGGGTTTGA CCCGTGAGAA GTTTATGAAT 
961 GAGGCGTTTT TACGAATGAC AGCTGCTGAG CAGGTTGATC TTTTTGTAGC TACCCCAAGC 
1021 AATATTCCAG CAGAGTCATT TGAAGTTTAC GAAGTTGCAC TTGCTCTTGT GGCTCAAGCT 
1081 TTTATTGGTA AGAAGCCACA CCTTTTACAG GATGCTGATA AGCAATTCCA GCAACTTCAG 
1141 CAGGCTAAGG TAATGGCTAT GGAGATTCCT GCGATGTTGT ATGATACACG GAATAATTGG 

1201 GAGATAGACT TCGGTCTAGA AAGGGGACTC TGTGCACTGC TTATAGGCAA AGTTGATGAA
1261 TGCCGTATGT GGTTGGGCTT AGACAGTGAG GATTCACAAT ATAGGAATCC AGCTATTGTG
1321 GAGTTTGTTT TGGAGAATTC AAATCGTGAT GACAATGATG ATCTCCCTGG ACTATGCAAA
1381 TTGTTGGAAA CCTGGTTGGC AGGGGTTGTC TTTCCTAGGT TCAGAGACAC CAAAGATAAA
1441 AAATTTAAAC TCGGGGACTA CTATGATGAT CCTATGGTTT TGAGTTACTT GGAAAGAGTG 
1501 GAGGTAGTTC AGGGTTCTCC TTTAGCTGCT GCTGCAACTA TGGCAAGGAT TGGAGCCGAG 
1561 CATGTGAAAG CTAGTGCTAT GCAGGCACTG CAGAAAGTTT TTCCTTCCCG CTATACAGAT 
1621 AGAAACTCGG CTGAACCCAA GGATGTGCAA GAGACAGTGT TTAGTGTAGA TCCTGTTGGT 
1681 AACAATGTAG GCCGTGATGG TGAGCCTGGT GTCTTTATTG CAGAAGCTGT AAGACCCTCT
1741 GAAAACTTTG AAACTAATGA TTATGCAATT CGAGCTGGGG TCTCAGAGAG TAGCGTTGAT 
1801 GAAACTACTG TTGAAATGTC CGTTGCTGAT ATGTTAAAGG AGGCAAGTGT GAAGATCCTA 
1861 GCTGCTGGTG TGGCAATTGG ACTGATTTCA CTGTTCAGCC AGAAGTATTT TCTTAAAAGC 
1921 AGCTCATCTT TTCAACGCAA GGATATGGTT TCTTCTATGG AATCTGATGT CGCTACCATA 
1981 GGGTCAGTCA GAGCTGACGA TTCAGAAGCA CTTCCCAGAA TGGATGCTAG GACTGCAGAG 

2041 AATATAGTAT CCAAGTGGCA GAAGATTAAG TCTCTGGCTT TTGGGCCTGA TCACCGCATA
2101 GAAATGTTAC CAGAGGTTTT GGATGGGCGA ATGCTGAAGA TTTGGACTGA CAGAGCAGCT 
2161 GAAACTGCGC AGCTTGGGTT GGTTTATGAT TATACACTGT TGAAACTATC TGTTGACAGT 
2221 GTGACAGTCT CAGCAGATGG AACCCGTGCT CTGGTGGAAG CAACTCTGGA GGAGTCTGCT 
2281 TGTCTATCTG ATTTGGTTCA TCCAGAAAAC AATGCTACTG ATGTCAGAAC CTACACAACA
2341 AGATACGAAG TTTTCTGGTC CAAGTCAGGG TGGAAAATCA CTGAAGGCTC TGTTCTTGCA
2401 TCATAA

B. Genomic sequence of AtFtn2 gene (SEQ ID NO:2)
(synonym: At5g42480; synonym: ARC6)

Sequence length = 3667 nt

This sequence contains 480 nt of the 5' and 149 nt of the 3' region 
Start codon (ATG) is at position 481-483 
Stop codon (TAA) is at position 3516-3518 



1 TGTTCTGCAT TAAGGAGAAT ACAATTATAA GCAATTTGTC TTGATTTCAA CAAGATTTTG 
61 CTTGGCTATA GGATTCATTG GCTCTGTTTG CTTTTACATT TACATGTCAT AATAGTTTCG 
121 AATTTTACAC ATTTCAGTTG GATGTTAAGA AAAGAGAGGG AATTGATGGG GTTTTGTGGG 
181 TTTAAACTTT AAAGTAGTCA AGAATTAAGT CATTGGTTTA CTGTTGCTCT ATATGTGTAA 
241 AATGAAGGCA ACTCCAACGG TTCTTAGGTG GAATAGATTA TTTAGACGAT TTAACATCAT 
301 AAAGTCCGTG GCGACTGTAA CATCATAGAT TGTTTTTTAT TTTTTTCAGT AGCTGGTGAT 
361 GTTTTTTGAT TTAACTTATA CTACTCAAAA TCAAAATTCC ATAAACCCTA GACGACCAAA
421 CAGTCTCTTC AATATGTAAA ACAGAACAAA GTTTTTGTAG TAGCCTAAAA AGACACTCCC 
481 ATGGAAGCTC TGAGTCACGT CGGCATTGGT CTCTCCCCAT TCCAATTATG CCGATTACCA 
541 CCGGCGACGA CAAAGCTCCG ACGTAGCCAC AACACCTCTA CAACTATCTG CTCCGCCAGC 
601 AAATGGGCCG ACCGTCTTCT CTCCGACTTC AATTTCACCT CCGATTCCTC CTCCTCCTCC 
661 TTCGCCACCG CCACCACCAC CGCCACTCTC GTCTCTCTGC CACCATCTAT TGATCGTCCC 
721 GAACGCCACG TCCCCATCCC CATTGATTTC TACCAGGTAT TAGGAGCTCA AACACATTTC 
781 TTAACCGATG GAATCAGAAG AGCATTCGAA GCTAGGGTTT CGAAACCGCC GCAATTCGGT 
841 TTCAGCGACG ACGCTTTAAT CAGCCGGAGA CAGATTCTTC AAGCTGCTTG CGAAACTCTG 
901 TCTAATCCTC GGTCTAGAAG AGAGTACAAT GAAGGTCTTC TTGATGATGA AGAAGCTACA 
961 GTCATCACTG ATGTTCCTTG GGATAAGGTA ATTTCGATTT CGGAATAATA AAGTTTCTTC 
1021 GTTTTAATTT CATGAATTGG ATAAAGGAAG GAACTTTTAT CTAGTGAAGG TTCCTGGGGC 
1081 TCTCTGTGTA TTGCAAGAAG GTGGTGAGAC TGAGATAGTT CTTCGGGTTG GTGAGGCTCT 
1141 GCTTAAGGAG AGGTTGCCTA AGTCGTTTAA GCAAGATGTG GTTTTAGTTA TGGCGCTTGC 
1201 GTTTCTCGAT GTCTCGAGGG ATGCTATGGC ATTGGATCCA CCTGATTTTA TTACTGGTTA
1261 TGAGTTTGTT GAGGAAGCTT TGAAGCTTTT ACAGGTAGTT TGACTTGCTT TGGTAATTTG
1321 ACGAGCGTTG GCTTTATAAG AACTTTCTTG ATTTGATACT TTGTTATTGA GTCTTGTGTA
1381 GGAGGAAGGA GCAAGTAGCC TTGCACCGGA TTTACGTGCA CAAATTGATG AGACTTTGGA
1441 AGAGATCACT CCGCGTTATG TCTTGGAGCT ACTTGGCTTA CCGCTTGGTG ATGATTACGC
1501 TGCGAAAAGA CTAAATGGTT TAAGCGGTGT GCGGAATATT TTGTGGTCTG TTGGAGGAGG
1561 TGGAGCATCA GCTCTTGTTG GGGGTTTGAC CCGTGAGAAG TTTATGAATG AGGCGTTTTT 
1621 ACGAATGACA GCTGCTGAGC AGGTATACAG TTTAGATACC TTTTTTTAAT TTCTTTAGCA 
1681 TGATATAACT TTAGGTTTCT CATTTTAATG TATGTTGTGT GGTAGGTTGA TCTTTTTGTA 
1741 GCTACCCCAA GCAATATTCC AGCAGAGTCA TTTGAAGTTT ACGAAGTTGC ACTTGCTCTT 
1801 GTGGCTCAAG CTTTTATTGG TAAGAAGCCA CACCTTTTAC AGGATGCTGA TAAGCAATTC
1861 CAGCAACTTC AGCAGGCTAA GGTAATGGCT ATGGAGATTC CTGCGATGTT GTATGATACA 
1921 CGGAATAATT GGGAGATAGA CTTCGGTCTA GAAAGGGGAC TCTGTGCACT GCTTATAGGC 
1981 AAAGTTGATG AATGCCGTAT GTGGTTGGGC TTAGACAGTG AGGATTCACA ATATAGGAAT 

2041 CCAGCTATTG TGGAGTTTGT TTTGGAGAAT TCAAATCGTG ATGACAATGA TGATCTCCCT
2101 GGACTATGCA AATTGTTGGA AACCTGGTTG GCAGGGGTTG TCTTTCCTAG GTTCAGAGAC 
2161 ACCAAAGATA AAAAATTTAA ACTCGGGGAC TACTATGATG ATCCTATGGT TTTGAGTTAC 
2221 TTGGAAAGAG TGGAGGTAGT TCAGGGTTCT CCTTTAGCTG CTGCTGCAAC TATGGCAAGG 
2281 ATTGGAGCCG AGCATGTGAA AGCTAGTGCT ATGCAGGCAC TGCAGAAAGT TTTTCCTTCC
2341 CGCTATACAG ATAGAAACTC GGCTGAACCC AAGGATGTGC AAGAGACAGT GTTTAGTGTA 
2401 GATCCTGTTG GTAACAATGT AGGCCGTGAT GGTGAGCCTG GTGTCTTTAT TGCAGAAGCT
2461 GTAAGACCCT CTGAAAACTT TGAAACTAAT GATTATGCAA TTCGAGCTGG GGTCTCAGAG
2521 AGTAGCGTTG ATGAAACTAC TGTTGAAATG TCCGTTGCTG ATATGTTAAA GGAGGCAAGT
2581 GTGAAGATCC TAGCTGCTGG TGTGGCAATT GGACTGATTT CACTGTTCAG CCAGAAGTAT
2641 TTTCTTAAAA GCAGCTCATC TTTTCAACGC AAGGATATGG TTTCTTCTAT GGAATCTGAT
2701 GTCGCTACCA TAGGTATGAT TAAATGATGC AATTTTCATA TATCTGCATT GCTCAAAATA
2761 TGCTTGTTTT GTGAGCTAAG AACATAGTTC CCACTTAATA CATGTCCCAA AAGTTGTACC
2821 AAGATTAACA AGTTGCTGAG TAAATTTCAC TAATTATGCT GCTTGAATTT TTTGATCAAA
2881 CTGTAGACAG AAATGTAAAT TTCACTCTCA ACATTTCTGT TTAGAATAAC GTAGGATTAG
2941 AGATTGCCTT AGTGTGGCTT TGTCCAACTT TTCTTTCCTT GATTTTTTTC TTTTCGATTT
3001 AGGGTCAGTC AGAGCTGACG ATTCAGAAGC ACTTCCCAGA ATGGATGCTA GGACTGCAGA
3061 GAATATAGTA TCCAAGTGGC AGAAGATTAA GTCTCTGGCT TTTGGGCCTG ATCACCGCAT
3121 AGAAATGTTA CCAGAGGTGA GGGAATAAAT CTACAATTCA ATCAATTGTG TGAAAACTGT
3181 TGGACATGAT TATAGTCTGG TGCCTTGTTT GATTCTGTTA TTTATAGGTT TTGGATGGGC
3241 GAATGCTGAA GATTTGGACT GACAGAGCAG CTGAAACTGC GCAGCTTGGG TTGGTTTATG
3301 ATTATACACT GTTGAAACTA TCTGTTGACA GTGTGACAGT CTCAGCAGAT GGAACCCGTG
3361 CTCTGGTGGA AGCAACTCTG GAGGAGTCTG CTTGTCTATC TGATTTGGTT CATCCAGAAA
3421 ACAATGCTAC TGATGTCAGA ACCTACACAA CAAGATACGA AGTTTTCTGG TCCAAGTCAG
3481 GGTGGAAAAT CACTGAAGGC TCTGTTCTTG CATCATAATA TACTCATATG TAGCATGTCT
3541 GAGCTTGCGA GATTCTCTTT GTTCTGTAAA TTCTCTCTCT AAGTTAGTGT TTATAAATGA
3601 ACACAAAAAA ATTAACGTTC TTGGCACACC CTTTTCCTTG ATCTAAACTA TAACATAAGG
3661 GCTACAA



FIG. 1 continued 4/6 



C. predicted cDNA sequence of mutated AtFtn2 gene (SEQ ID NO:9) 
(synonym: At5g42480; synonym: ARC6)

Sequence length = 2406 nt 

Start codon (ATG) is at position 1-3 

Premature stop codon (TGA) is at position 973-975 

Stop codon (TAA) is at position 2404-2406 



1 ATGGAAGCTC TGAGTCACGT CGGCATTGGT CTCTCCCCAT TCCAATTATG CCGATTACCA
61 CCGGCGACGA CAAAGCTCCG ACGTAGCCAC AACACCTCTA CAACTATCTG CTCCGCCAGC
121 AAATGGGCCG ACCGTCTTCT CTCCGACTTC AATTTCACCT CCGATTCCTC CTCCTCCTCC
181 TTCGCCACCG CCACCACCAC CGCCACTCTC GTCTCTCTGC CACCATCTAT TGATCGTCCC
241 GAACGCCACG TCCCCATCCC CATTGATTTC TACCAGGTAT TAGGAGCTCA AACACATTTC
301 TTAACCGATG GAATCAGAAG AGCATTCGAA GCTAGGGTTT CGAAACCGCC GCAATTCGGT
361 TTCAGCGACG ACGCTTTAAT CAGCCGGAGA CAGATTCTTC AAGCTGCTTG CGAAACTCTG 
421 TCTAATCCTC GGTCTAGAAG AGAGTACAAT GAAGGTCTTC TTGATGATGA AGAAGCTACA 
481 GTCATCACTG ATGTTCCTTG GGATAAGGTT CCTGGGGCTC TCTGTGTATT GCAAGAAGGT 
541 GGTGAGACTG AGATAGTTCT TCGGGTTGGT GAGGCTCTGC TTAAGGAGAG GTTGCCTAAG 
601 TCGTTTAAGC AAGATGTGGT TTTAGTTATG GCGCTTGCGT TTCTCGATGT CTCGAGGGAT 
661 GCTATGGCAT TGGATCCACC TGATTTTATT ACTGGTTATG AGTTTGTTGA GGAAGCTTTG 
721 AAGCTTTTAC AGGAGGAAGG AGCAAGTAGC CTTGCACCGG ATTTACGTGC ACAAATTGAT 
781 GAGACTTTGG AAGAGATCAC TCCGCGTTAT GTCTTGGAGC TACTTGGCTT ACCGCTTGGT
841 GATGATTACG CTGCGAAAAG ACTAAATGGT TTAAGCGGTG TGCGGAATAT TTTGTGGTCT 
901 GTTGGAGGAG GTGGAGCATC AGCTCTTGTT GGGGGTTTGA CCCGTGAGAA GTTTATGAAT 
961 GAGGCGTTTT TATGAATGAC AGCTGCTGAG CAGGTTGATC TTTTTGTAGC TACCCCAAGC 
1021 AATATTCCAG CAGAGTCATT TGAAGTTTAC GAAGTTGCAC TTGCTCTTGT GGCTCAAGCT 
1081 TTTATTGGTA AGAAGCCACA CCTTTTACAG GATGCTGATA AGCAATTCCA GCAACTTCAG
1141 CAGGCTAAGG TAATGGCTAT GGAGATTCCT GCGATGTTGT ATGATACACG GAATAATTGG
1201 GAGATAGACT TCGGTCTAGA AAGGGGACTC TGTGCACTGC TTATAGGCAA AGTTGATGAA
1261 TGCCGTATGT GGTTGGGCTT AGACAGTGAG GATTCACAAT ATAGGAATCC AGCTATTGTG
1321 GAGTTTGTTT TGGAGAATTC AAATCGTGAT GACAATGATG ATCTCCCTGG ACTATGCAAA
1381 TTGTTGGAAA CCTGGTTGGC AGGGGTTGTC TTTCCTAGGT TCAGAGACAC CAAAGATAAA
1441 AAATTTAAAC TCGGGGACTA CTATGATGAT CCTATGGTTT TGAGTTACTT GGAAAGAGTG 
1501 GAGGTAGTTC AGGGTTCTCC TTTAGCTGCT GCTGCAGCTA TGGCAAGGAT TGGAGCCGAG 
1561 CATGTGAAAG CTAGTGCTAT GCAGGCACTG CAGAAAGTTT TTCCTTCCCG CTATACAGAT 
1621 AGAAACTCGG CTGAACCCAA GGATGTGCAA GAGACAGTGT TTAGTGTAGA TCCTGTTGGT 
1681 AACAATGTAG GCCGTGATGG TGAGCCTGGT GTCTTTATTG CAGAAGCTGT AAGACCCTCT 
1741 GAAAACTTTG AAACTAATGA TTATGCAATT CGAGCTGGGG TCTCAGAGAG TAGCGTTGAT 
1801 GAAACTACTG TTGAAATGTC CGTTGCTGAT ATGTTAAAGG AGGCAAGTGT GAAGATCCTA 
1861 GCTGCTGGTG TGGCAATTGG ACTGATTTCA CTGTTCAGCC AGAAGTATTT TCTTAAAAGC 
1921 AGCTCATCTT TTCAACGCAA GGATATGGTT TCTTCTATGG AATCTGATGT CGCTACCATA 
1981 GGGTCAGTCA GAGCTGACGA TTCAGAAGCA CTTCCCAGAA TGGATGCTAG GACTGCAGAG 
2041 AATATAGTAT CCAAGTGGCA GAAGATTAAG TCTCTGGCTT TTGGGCCTGA TCACCGCATA 
2101 GAAATGTTAC CAGAGGTTTT GGATGGGCGA ATGCTGAAGA TTTGGACTGA CAGAGCAGCT 
2161 GAAACTGCGC AGCTTGGGTT GGTTTATGAT TATACACTGT TGAAACTATC TGTTGACAGT 

2221 GTGACAGTCT CAGCAGATGG AACCCGTGCT CTGGTGGAAG CAACTCTGGA GGAGTCTGCT
2281 TGTCTATCTG ATTTGGTTCA TCCAGAAAAC AATGCTACTG ATGTCAGAAC CTACACAACA
2341 AGATACGAAG TTTTCTGGTC CAAGTCAGGG TGGAAAATCA CTGAAGGCTC TGTTCTTGCA
2401 TCATAA

D. Genomic sequence of mutated AtFtn2 gene (SEQ ID NO:10)
(synonym: At5g42480; synonym: ARC6)

Sequence length = 3667 nt

This sequence contains 480 nt of the 5' and 149 nt of the 3' region

Start codon (ATG) is at position 481-483 

Premature stop codon (TGA) is at position 1622-1624 

Stop codon (TAA) is at position 3516-3518 

1 TGTTCTGCAT TAAGGAGAAT ACAATTATAA GCAATTTGTC TTGATTTCAA CAAGATTTTG 
61 CTTGGCTATA GGATTCATTG GCTCTGTTTG CTTTTACATT TACATGTCAT AATAGTTTCG 
121 AATTTTACAC ATTTCAGTTG GATGTTAAGA AAAGAGAGGG AATTGATGGG GTTTTGTGGG 
181 TTTAAACTTT AAAGTAGTCA AGAATTAAGT CATTGGTTTA CTGTTGCTCT ATATGTGTAA 
241 AATGAAGGCA ACTCCAACGG TTCTTAGGTG GAATAGATTA TTTAGACGAT TTAACATCAT 
301 AAAGTCCGTG GCGACTGTAA CATCATAGAT TGTTTTTTAT TTTTTTCAGT AGCTGGTGAT 
361 GTTTTTTGAT TTAACTTATA CTACTCAAAA TCAAAATTCC ATAAACCCTA GACGACCAAA 
421 CAGTCTCTTC AATATGTAAA ACAGAACAAA GTTTTTGTAG TAGCCTAAAA AGACACTCCC 
481 ATGGAAGCTC TGAGTCACGT CGGCATTGGT CTCTCCCCAT TCCAATTATG CCGATTACCA
541 CCGGCGACGA CAAAGCTCCG ACGTAGCCAC AACACCTCTA CAACTATCTG CTCCGCCAGC 
601 AAATGGGCCG ACCGTCTTCT CTCCGACTTC AATTTCACCT CCGATTCCTC CTCCTCCTCC 
661 TTCGCCACCG CCACCACCAC CGCCACTCTC GTCTCTCTGC CACCATCTAT TGATCGTCCC 
721 GAACGCCACG TCCCCATCCC CATTGATTTC TACCAGGTAT TAGGAGCTCA AACACATTTC 
781 TTAACCGATG GAATCAGAAG AGCATTCGAA GCTAGGGTTT CGAAACCGCC GCAATTCGGT 
841 TTCAGCGACG ACGCTTTAAT CAGCCGGAGA CAGATTCTTC AAGCTGCTTG CGAAACTCTG 
901 TCTAATCCTC GGTCTAGAAG AGAGTACAAT GAAGGTCTTC TTGATGATGA AGAAGCTACA 
961 GTCATCACTG ATGTTCCTTG GGATAAGGTA ATTTCGATTT CGGAATAATA AAGTTTCTTC 
1021 GTTTTAATTT CATGAATTGG ATAAAGGAAG GAACTTTTAT CTAGTGAAGG TTCCTGGGGC 
1081 TCTCTGTGTA TTGCAAGAAG GTGGTGAGAC TGAGATAGTT CTTCGGGTTG GTGAGGCTCT 
1141 GCTTAAGGAG AGGTTGCCTA AGTCGTTTAA GCAAGATGTG GTTTTAGTTA TGGCGCTTGC 
1201 GTTTCTCGAT GTCTCGAGGG ATGCTATGGC ATTGGATCCA CCTGATTTTA TTACTGGTTA
1261 TGAGTTTGTT GAGGAAGCTT TGAAGCTTTT ACAGGTAGTT TGACTTGCTT TGGTAATTTG
1321 ACGAGCGTTG GCTTTATAAG AACTTTCTTG ATTTGATACT TTGTTATTGA GTCTTGTGTA
1381 GGAGGAAGGA GCAAGTAGCC TTGCACCGGA TTTACGTGCA CAAATTGATG AGACTTTGGA
1441 AGAGATCACT CCGCGTTATG TCTTGGAGCT ACTTGGCTTA CCGCTTGGTG ATGATTACGC
1501 TGCGAAAAGA CTAAATGGTT TAAGCGGTGT GCGGAATATT TTGTGGTCTG TTGGAGGAGG
1561 TGGAGCATCA GCTCTTGTTG GGGGTTTGAC CCGTGAGAAG TTTATGAATG AGGCGTTTTT
1621 ATGAATGACA GCTGCTGAGC AGGTATACAG TTTAGATACC TTTTTTTAAT TTCTTTAGCA 
1681 TGATATAACT TTAGGTTTCT CATTTTAATG TATGTTGTGT GGTAGGTTGA TCTTTTTGTA 
1741 GCTACCCCAA GCAATATTCC AGCAGAGTCA TTTGAAGTTT ACGAAGTTGC ACTTGCTCTT 
1801 GTGGCTCAAG CTTTTATTGG TAAGAAGCCA CACCTTTTAC AGGATGCTGA TAAGCAATTC
1861 CAGCAACTTC AGCAGGCTAA GGTAATGGCT ATGGAGATTC CTGCGATGTT GTATGATACA 
1921 CGGAATAATT GGGAGATAGA CTTCGGTCTA GAAAGGGGAC TCTGTGCACT GCTTATAGGC 
1981 AAAGTTGATG AATGCCGTAT GTGGTTGGGC TTAGACAGTG AGGATTCACA ATATAGGAAT 

2041 CCAGCTATTG TGGAGTTTGT TTTGGAGAAT TCAAATCGTG ATGACAATGA TGATCTCCCT
2101 GGACTATGCA AATTGTTGGA AACCTGGTTG GCAGGGGTTG TCTTTCCTAG GTTCAGAGAC 
2161 ACCAAAGATA AAAAATTTAA ACTCGGGGAC TACTATGATG ATCCTATGGT TTTGAGTTAC 
2221 TTGGAAAGAG TGGAGGTAGT TCAGGGTTCT CCTTTAGCTG CTGCTGCAGC TATGGCAAGG
2281 ATTGGAGCCG AGCATGTGAA AGCTAGTGCT ATGCAGGCAC TGCAGAAAGT TTTTCCTTCC
2341 CGCTATACAG ATAGAAACTC GGCTGAACCC AAGGATGTGC AAGAGACAGT GTTTAGTGTA
2401 GATCCTGTTG GTAACAATGT AGGCCGTGAT GGTGAGCCTG GTGTCTTTAT TGCAGAAGCT
2461 GTAAGACCCT CTGAAAACTT TGAAACTAAT GATTATGCAA TTCGAGCTGG GGTCTCAGAG
2521 AGTAGCGTTG ATGAAACTAC TGTTGAAATG TCCGTTGCTG ATATGTTAAA GGAGGCAAGT
2581 GTGAAGATCC TAGCTGCTGG TGTGGCAATT GGACTGATTT CACTGTTCAG CCAGAAGTAT
2641 TTTCTTAAAA GCAGCTCATC TTTTCAACGC AAGGATATGG TTTCTTCTAT GGAATCTGAT
2701 GTCGCTACCA TAGGTATGAT TAAATGATGC AATTTTCATA TATCTGCATT GCTCAAAATA
2761 TGCTTGTTTT GTGAGCTAAG AACATAGTTC CCACTTAATA CATGTCCCAA AAGTTGTACC
2821 AAGATTAACA AGTTGCTGAG TAAATTTCAC TAATTATGCT GCTTGAATTT TTTGATCAAA
2881 CTGTAGACAG AAATGTAAAT TTCACTCTCA ACATTTCTGT TTAGAATAAC GTAGGATTAG
2941 AGATTGCCTT AGTGTGGCTT TGTCCAACTT TTCTTTCCTT GATTTTTTTC TTTTCGATTT
3001 AGGGTCAGTC AGAGCTGACG ATTCAGAAGC ACTTCCCAGA ATGGATGCTA GGACTGCAGA
3061 GAATATAGTA TCCAAGTGGC AGAAGATTAA GTCTCTGGCT TTTGGGCCTG ATCACCGCAT
3121 AGAAATGTTA CCAGAGGTGA GGGAATAAAT CTACAATTCA ATCAATTGTG TGAAAACTGT
3181 TGGACATGAT TATAGTCTGG TGCCTTGTTT GATTCTGTTA TTTATAGGTT TTGGATGGGC
3241 GAATGCTGAA GATTTGGACT GACAGAGCAG CTGAAACTGC GCAGCTTGGG TTGGTTTATG
3301 ATTATACACT GTTGAAACTA TCTGTTGACA GTGTGACAGT CTCAGCAGAT GGAACCCGTG
3361 CTCTGGTGGA AGCAACTCTG GAGGAGTCTG CTTGTCTATC TGATTTGGTT CATCCAGAAA
3421 ACAATGCTAC TGATGTCAGA ACCTACACAA CAAGATACGA AGTTTTCTGG TCCAAGTCAG
3481 GGTGGAAAAT CACTGAAGGC TCTGTTCTTG CATCATAATA TACTCATATG TAGCATGTCT
3541 GAGCTTGCGA GATTCTCTTT GTTCTGTAAA TTCTCTCTCT AAGTTAGTGT TTATAAATGA
3601 ACACAAAAAA ATTAACGTTC TTGGCACACC CTTTTCCTTG ATCTAAACTA TAACATAAGG
3661 GCTACAA

Amino Acid Sequences

A. predicted amino acid sequence of AtFtn2 

(synonym: At5g42480; synonym: ARC6) protein 

Sequence length = 801 aa 

1 MEALSHVGIG LSPFQLCRLP PATTKLRRSH NTSTTICSAS KWADRLLSDF NFTSDSSSSS
61 FATATTTATL VSLPPSIDRP ERHVPIPIDF YQVLGAQTHF LTDGIRRAFE ARVSKPPQFG
121 FSDDALISRR QILQAACETL SNPRSRREYN EGLLDDEEAT VITDVPWDKV PGALCVLQEG
181 GETEIVLRVG EALLKERLPK SFKQDVVLVM ALAFLDVSRD AMALDPPDFI TGYEFVEEAL
241 KLLQEEGASS LAPDLRAQID ETLEEITPRY VLELLGLPLG DDYAAKRLNG LSGVRNILWS
301 VGGGGASALV GGLTREKFMN EAFLRMTAAE QVDLFVATPS NIPAESFEVY EVALALVAQA
361 FIGKKPHLLQ DADKQFQQLQ QAKVMAMEIP AMLYDTRNNW EIDFGLERGL CALLIGKVDE
421 CRMWLGLDSE DSQYRNPAIV EFVLENSNRD DNDDLPGLCK LLETWLAGVV FPRFRDTKDK
481 KFKLGDYYDD PMVLSYLERV EVVQGSPLAA AATMARIGAE HVKASAMQAL QKVFPSRYTD
541 RNSAEPKDVQ ETVFSVDPVG NNVGRDGEPG VFIAEAVRPS ENFETNDYAI RAGVSESSVD
601 ETTVEMSVAD MLKEASVKIL AAGVAIGLIS LFSQKYFLKS SSSFQRKDMV SSMESDVATI
661 GSVRADDSEA LPRMDARTAE NIVSKWQKIK SLAFGPDHRI EMLPEVLDGR MLKIWTDRAA
721 ETAQLGLVYD YTLLKLSVDS VTVSADGTRA LVEATLEESA CLSDLVHPEN NATDVRTYTT
781 RYEVFWSKSG WKITEGSVLA S*



B. predicted amino acid sequence of mutated AtFtn2 
(synonym: At5g42480; synonym: ARC6) protein 

Sequence length = 324 aa 

The mutated protein is truncated as a result of the arc6 mutation
(premature stop)

1 MEALSHVGIG LSPFQLCRLP PATTKLRRSH NTSTTICSAS KWADRLLSDF NFTSDSSSSS
61 FATATTTATL VSLPPSIDRP ERHVPIPIDF YQVLGAQTHF LTDGIRRAFE ARVSKPPQFG
121 FSDDALISRR QILQAACETL SNPRSRREYN EGLLDDEEAT VITDVPWDKV PGALCVLQEG
181 GETEIVLRVG EALLKERLPK SFKQDVVLVM ALAFLDVSRD AMALDPPDFI TGYEFVEEAL
241 KLLQEEGASS LAPDLRAQID ETLEEITPRY VLELLGLPLG DDYAAKRLNG LSGVRNILWS
301 VGGGGASALV GGLTREKFMN EAFL*
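
The stated stop-codon positions and protein lengths can be cross-checked with a short script. The following is a minimal sketch only, assuming the standard genetic code and an intron-free cDNA numbered from the A of the start codon; the names `translate` and `residues_before_stop` are illustrative and not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (an assumption:
# the listing does not state which translation table was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def residues_before_stop(stop_start: int, cds_start: int = 1) -> int:
    """Number of residues encoded before a stop codon that begins at stop_start."""
    offset = stop_start - cds_start
    if offset % 3 != 0:
        raise ValueError("stop codon is not in frame with the start codon")
    return offset // 3


# Positions taken from the figure legends (cDNA numbering).
print(residues_before_stop(2404))  # wild-type stop codon at 2404-2406
print(residues_before_stop(973))   # premature stop codon at 973-975

# Nucleotides 961-981 of the wild-type and mutated cDNA, as printed above.
print(translate("GAGGCGTTTT TACGAATGAC A"))
print(translate("GAGGCGTTTT TATGAATGAC A"))
```

Under these assumptions the two position checks give 801 and 324 residues, matching the stated sequence lengths, and the single C-to-T change at position 973 turns the CGA codon into the TGA stop that ends the mutated protein after EAFL.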
FIG. 6



Synechococcus sp. PCC 7942 cell division protein Ftn2 gene 
A. Ftn2 DNA nucleic acid sequence (SEQ ID NO:4) 



1 cttgccgact aaaggctaag catcgccatt ccttagatta aagcagtctg tcggcggcgc 
61 tgtgccggtt aacaccagtc tgtcgctgac agcggtgcct ttctggggct tgcctgtggg 
121 gcgagtaacc gatcgctggg ataagagttg gtgcttctgg ctctcaagaa tagggttttc 
181 cgtcgcgtat tcccgatcac atccccctgt gtctgctacg gagataacgc cgatcactca 
241 acagaattgg taagttgacg gtcaagttgg gatgatgaag tcggctcaag ctggcgatcc 
301 ggatctggtg ggtgttctgt gcgtattcct ctcgattact accgaattct ctgtgttggc 
361 gtgcaagcct cggcagacaa acttgccgaa agctaccgcg atcgcctcaa ccaatcgccc 
421 tcccatgagt tttcagagct ggcattgcag gcgcggcggc aactcctcga agcagcgatt 
481 gctgagctga gtgatcccga acagcgcgat cgctacgatc gccgcttttt tcagggcggt 
541 ctggaagcga ttgaaccaag cctagaactc gaagactggc agcgaattgg agccctgctg 
601 atcctgctgg aattggggga atacgatcgc gtttcgcaac tggctgagga actcctgcca 
661 gactacgacg cgagcgcaga agtacgcgat cagttcgcgc ggggtgatat cgccttggcg 
721 atcgcactat cccagcaatc cctcggtcga gaatgccgtc agcagggtct gtacgaacag 
781 gccgcccagc actttggccg cagccagtct gccctagccg atcatcagcg ctttcctgaa 
841 ctgagtcgaa ccctgcacca agaacaagga cagctacggc cctatcgcat tttggagcgg 
901 ttggcccagc ccttgactgc cgatagcgat cgccagcagg gtttgctgtt gttgcaggcg 
961 atgttggacg accggcaggg cattgaaggc cctggggatg atggctcggg gctgaccctt 
1021 gataactttt tgatgtttct ccagcaaatt cgcggctatc tgaccctggc tgaacagcag 
1081 ttgctgtttg aatcggaagc gcgtcggccc tcgccggctg cgagcttttt tgcctgctac 
1141 accctgattg cgcggggctt ttgcgatcac caaccctcgt tgatccatcg cgccagcttg 
1201 ctcttgcatg aactcaagag ccgcatggat gtgcacatcg aacaggcgat cgccagccta 
1261 ttgctcggac agcccgaaga agctgaggcg ctactcgtcc agagccaaga tgaggaaacc 
1321 ctcagccaaa tccgtgccct agcccaaggg gaagccctga tcgtcggttt gtgccgattc 
1381 acggaaacct ggctagcgac caaggtattt ccggatttcc gcgacctcaa ggaaaggact 
1441 gcgccgctgc agccctactt tgacgacccc gatgtccaga cctatctgga tgcgatcgtg 
1501 gagttgccgt ccgatttgat gccaacgccg ctacccgttg agccgcttga ggtgcgatcg 
1561 tcgttgctgg ccaaggaact gccgacccca gcaacgcctg gtgtagctcc accccctcgc 
1621 cgccgtcgcc gcgatcgctc cgaacgtcct gctcgcacgg ccaaacgctt gcccttgccc 
1681 tggattggtt tgggggttgt ggtggttctc ggcggtggaa caggggtttg ggcttggcga 
1741 tcgcgttcca attccacccc gccgaccccg ccccccgtgg ttcaaacgct gcctgaggcg 



FIG. 6 continued (2/2) 



1801 gtacctgccc cttcgcccgc gccagttacc gttgccctcg atcgggctca ggctgaaact 
1861 gtgttgcaaa actggttggc cgctaaagct gcagccttgg ggcctcaata cgatcgcgat 
1921 cgcttagcga cggtgctgac cggtgaggtt ctgcagactt ggcagggttt ttctagccag 
1981 caggccaaca cccagctcac atcacagttc gatcacaagt taaccgtcga ctcagttcag 
2041 ctcagtgacg gtgatcaacg agcagtagtc caagccaagg tcgatgaagt tgagcaggtc
2101 tatcgaggcg accagctgct cgaaacgcgc cgagatttgg gcttggtgat ccgctaccag 
2161 ctcgtgcgcg agaacaacat ctggaaaatt gcttcgatta gtttggtgcg ctaggaattc 
2221 gcaaggggtg aaccccctgc ggtcttttct gtagatcccc tagagcgatc gcagaatgtt 
2281 cagcgattcc tggatgtgcg cttgggcatt caagagtgaa tcaaaaatgt ggcgcacctt 
2341 gccctctttg tcgatcacat aagtgacgcg acccggaatc acaaacaggg ttttgggcac 
2401 gccataggtt tgacggaggc gatcgcctgc atcgctcagc agttggaagg gcaagttgta 
2461 tttctgggc 



B. Ftn2 Protein amino acid sequence (SEQ ID NO:5) 

translation="MRIPLDYYRILCVGVQASADKLAESYRDRLNQSPSHEFSELALQ
ARRQLLEAAIAELSDPEQRDRYDRRFFQGGLEAIEPSLELEDWQRIGALLILLELGEY
DRVSQLAEELLPDYDASAEVRDQFARGDIALAIALSQQSLGRECRQQGLYEQAAQHFG
RSQSALADHQRFPELSRTLHQEQGQLRPYRILERLAQPLTADSDRQQGLLLLQAMLDD
RQGIEGPGDDGSGLTLDNFLMFLQQIRGYLTLAEQQLLFESEARRPSPAASFFACYTL
IARGFCDHQPSLIHRASLLLHELKSRMDVHIEQAIASLLLGQPEEAEALLVQSQDEET
LSQIRALAQGEALIVGLCRFTETWLATKVFPDFRDLKERTAPLQPYFDDPDVQTYLDA
IVELPSDLMPTPLPVEPLEVRSSLLAKELPTPATPGVAPPPRRRRRDRSERPARTAKR
LPLPWIGLGVVVVLGGGTGVWAWRSRSNSTPPTPPPVVQTLPEAVPAPSPAPVTVALD
RAQAETVLQNWLAAKAAALGPQYDRDRLATVLTGEVLQTWQGFSSQQANTQLTSQFD
HKLTVDSVQLSDGDQRAVVQAKVDEVEQVYRGDQLLETRRDLGLVIRYQLVRENNIW
KIASISLVR"



FIG. 7 

Synechococcus sp. PCC 7942 cell division protein Ftn6 gene 

A. Ftn6 DNA nucleic acid sequence (SEQ ID NO:6) 

1 ctcgatactt gggagttgaa cacagagtag tagtctaagt aacaactgct cgtgagcaat 
61 ttgctacact ttttaccaaa ttttgagctc agttttcgcg aaaactggga tgttgagttg 
121 aaccctcagc agcaaaattg taccgcctga gacttttacc gttttattcg gccatctggg 
181 aacaatcgcc ctggagctta ttgtgacctc tacccgtact gccgttattg ccttgttaga 
241 acgctatttc gagctgtcgg cagcgcgagc agcagaggtc ttgcagcaac tgcgatcgca 
301 ccaccctgaa gcctggattt atcccgccac agtcgaggcg atttaccaag gccgttaccg 
361 ctgggtgtcg atcgcacaaa tccttgctct gtggcagcgg cgcgggcaga tcaactgcca 
421 cttcagtgca gactatgagc gcttgttgct cggtgaagtt ccagagcaac ccgatcgcat 
481 caatgttgag acgcggctcc ctgcgatcgc catgaccttg ccttgggtgc cagaacagcc 
541 tggagaagca ttcgtgccag cgcaagatca gtcgggttta actgagcgcc tttataaaac 
601 gttggtcaaa gcgggcagcg attgcgctgg gtaggcttag aacagttgcc atccaaactt 
661 gagagtgccc gttcggccag ccaagagaat tccaagagcc tttcagaacg gacaacaatt 
721 ctgctctaca atcaagcccg agtgaagagg cggcgggcta ttggctgaat ggcaaaaaac 
781 atcattcttt cagcaatcgt gggttatacc tacgacaaaa ttgacctatt cttaacttct 
841 gcactccgta acacctcagc agatattctt ttaattgcat caagtccttc agcccaactc 
901 cgtcatcagt tattgagttc acctcgggtc aaactcgttg atgtgaacct tcaaggtgaa 
961 ccagctgaaa tggtatttcg ccgtttcttt attgccaagg agattttggc gagaatcgaa 
1021 gcagatgaaa ttctcttgag cgatgctcgc gatgtctatt tccaatctga cccttttggt 
1081 gtccaagggg ttttatttgc cgaggaacct cagctaatcg caaactgtaa agtcaatagc 
1141 agctggataa aaaaatactt aggagaggat gagtttcaag ccatttctcc taatccaatt 
1201 ctctgcgggg gcaaccatgt gctggatgcc accaaggcct ttagcctgac gttgaccaca 
1261 ccagaagaaa ttgttgggct gcccgagagt ttgctggcct tggcggctca agctgctcaa 
1321 gccgctggtg aaacagaggc aacacccgaa gccggccctt ggcgaatcac cctcgacttc 
1381 ccaagctttg 



B. Ftn6 Protein amino acid sequence (SEQ ID NO:7) 

MGTIALELIVTSTRTAVIALLERYFELSAARAAEVLQQLRSHHP 

EAWIYPATVEAIYQGRYRWVSIAQILALWQRRGQINCHFSADYERLLLGEVPEQP 

DRINVETRLPAIAMTLPWVPEQPGEAFVPAQDQSGLTERLYKTLVKAGSDCAG 



FIG. 8 



Additional Sequences 



First Set 



LOCUS       BK000999     2283 bp    mRNA    linear   PLN 06-JAN-2003
DEFINITION  CDS for rice Arc6 orthologue, predicted from AAAA01000502.
ACCESSION   BK000999
VERSION
KEYWORDS    .
SOURCE      Oryza sativa
  ORGANISM  Oryza sativa
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Oryza.
REFERENCE   1  (bases 1 to 2283)
  AUTHORS   Vitha,S., Koksharova,O., van Erp,H., Froehlich,J.E. and
            Osteryoung,K.W.
  TITLE     Arabidopsis Arc6: A J-Domain Plastid Division Protein Whose
            Prokaryotic Ancestors Are Unique to Cyanobacteria
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2283)
  AUTHORS   Vitha,S., Koksharova,O., van Erp,H., Froehlich,J.E. and
            Osteryoung,K.W.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-JAN-2003) Department of Plant Biology, Michigan
            State University, 166 Plant Biology Building, East Lansing,
            MI 48824, USA
FEATURES             Location/Qualifiers
     source          1..2283
                     /organism="Oryza sativa"
                     /strain="indica cultivar-group"
     gene            1..2283
                     /gene="Arc6"
                     /note="Orthologue of Arabidopsis At5g42480 (Arc6)"
     CDS             1..2283
                     /gene="Arc6"
                     /note="has chloroplast targeting N-terminal signal,
                     followed by J domain"
                     /codon_start=1
                     /product="Arc6"
                     /translation="MEGFHNLLARPNSAPFAFSLPRPRPRPRRRPPPHPSAACRAASR
                     WAERLFADFHLLPTAAPSDPPSPAPAPAAAPSASPFVPLFPDAAERSLPLQVDFYKVL
                     GAEPHFLGDGIRRAFEARIAKPPQYGYSTDALVGRRQMLQIAHDTLMNQNSRTQYDRA
                     LSENREEALTMDIAWDKEAGEALAVLVTGEQLLLDRPPKRFKQDVVLAMALAYVDLSR
                     DAMAASPPDVIGCCEVLERALKLLQEDGASNLAPDLLSQIDETLEEITPRCVLELLSL
                     PIDTEHHKKRQEGLQGARNILWSVGRGGIATVGGGFSREAFMNEAFLRMTSIEQMDFF
                     SKTPNSIPPEWFEIYNVALAHVAQAIISKRPQFIMMADDLFEQLQKFNIGSHYAYDNE
                     MDLALERAFCSLLVGDVSKCRMWLGIDNESSPYRDPKILEFIVTNSSISEENDLLPGL
                     CKLLETWLIFEVFPRSRDTRGMQFRLGDYYDDPEVLSYLERMEGGGASHLAAAAAIAK
                     LGAQATAALGTVKSNAIQAFNKVFPLIEQLDRSAMENTKDGPGGYLENFDQENAPAHD
                     SRNAALKIISAGALFALLAVIGAKYLPRKRPLSAIRSEHGSVAVANSVDSTDDPALDE
                     DPVHIPRMDAKLAEDIVRKWQSIKSKALGPEHSVASLQEVLDGNMLKVWTDRAAEIER
                     HGWFWEYTLSDVTIDSITISLDGRRATVEATIDEAGQLTDVTEPRNNDSYDTKYTTRY
                     EMAFSKLGGWKITEGAVLKS"

BASE COUNT 551 a 576 c 592 g 564 t 

ORIGIN
1 atggagggct tccacaacct cctcgcccgc cccaactcgg cgccattcgc cttctccctc
61 cctcgcccgc gcccgcgccc gcgccgcagg ccgccgcctc acccctccgc tgcctgccgc
121 gccgcgagcc gctgggccga acgcctcttc gccgacttcc acctcctccc caccgccgcg
181 ccctccgacc cgccgtcccc ggccccggcc ccggccgccg cgccctccgc ctcccccttc
241 gtcccgctct tccccgacgc cgccgaacgc tccctcccgc tccaagtcga tttctacaag
301 gttctagggg cagagccaca tttccttggc gatggcatca ggagggcgtt cgaggcacgg
361 atagccaagc caccgcagta tggctacagc acggatgctc ttgttggtcg tcgacaaatg
421 ctgcagattg cccatgacac tctcatgaac cagaactccc gcactcagta tgatcgtgcg
481 ctttctgaga accgtgaaga agctctcacc atggatattg cttgggacaa ggaggctggg
541 gaggcacttg ctgtgcttgt aactggagaa cagttgcttc tggatcggcc acccaagcgc
601 ttcaagcagg acgtggtgct agcgatggct ctggcttatg tggatctatc aagggatgct
661 atggcagcaa gccctccaga tgtaattggc tgctgcgagg tgctcgagag ggctctcaag
721 ctcttgcagg aagatggagc aagcaatctc gcacctgatc tgctttcaca gattgatgaa
781 actctcgagg agattacacc tcgctgtgta ttggagcttc tctcccttcc tattgacaca
841 gagcatcata agaagcgcca agaagggctt caaggtgcga gaaacatttt gtggagcgtt
901 ggcagaggag gtattgctac cgttggagga ggattttctc gtgaagcctt catgaacgag
961 gcttttttga ggatgacatc aattgaacag atggatttct tttcaaaaac accgaatagc
1021 attcctcctg aatggtttga aatttacaat gtagcacttg cacatgtcgc tcaagcaatt
1081 ataagtaaaa ggccacaatt catcatgatg gcggatgatc tttttgaaca actccagaag
1141 ttcaacatag gttctcatta tgcttatgat aatgagatgg accttgcatt ggaaagggca
1201 ttctgctcat tgctagtcgg agatgttagc aagtgcagaa tgtggcttgg aattgataat
1261 gagtcttcac catacagaga ccccaaaatt ctagagttta ttgtgaccaa ctctagcatc
1321 agtgaagaga atgatcttct tccagggctg tgcaagcttt tggagacttg gcttatcttt
1381 gaggtttttc ctaggagcag agatactcgg ggcatgcagt tcagacttgg agattactac
1441 gatgatccag aagttttaag ctacctagaa aggatggagg gtggtggtgc ttctcatttg
1501 gctgctgctg ctgctattgc aaaacttggt gctcaagcta cagctgcact tggtactgtg
1561 aaatcaaatg ctattcaagc gttcaacaag gtttttccat tgatagaaca gttagacagg
1621 tcagccatgg aaaatactaa agatggccct gggggatatc ttgaaaattt tgaccaggaa
1681 aatgcacctg ctcatgattc gagaaatgcc gccttgaaga ttatctctgc tggcgcactg
1741 tttgcactgt tggcagtaat tggggccaaa tatttgcctc gtaagaggcc cctttctgct
1801 attaggagtg agcatggatc tgtggcagtt gctaatagtg tcgactctac tgatgatcct
1861 gcactagatg aagatccagt acatattcct agaatggatg cgaagctggc agaagatatt
1921 gttcgcaagt ggcagagtat caaatctaag gccttgggac cagaacattc ggttgcatca
1981 ttgcaagagg ttcttgatgg caacatgcta aaggtgtgga ctgaccgagc agcggagatt
2041 gagcgtcatg ggtggttctg ggagtataca ctatccgatg tgacgattga tagcatcact
2101 atctccctag atggtcgacg agcgactgtg gaggctacga ttgatgaggc aggccaactt
2161 actgatgtta ctgagcccag aaacaatgat tcatatgaca caaaatacac tacccggtat
2221 gagatggcct tctccaagct aggagggtgg aagataacgg aaggagcagt cctcaagtcg
2281 tag
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linear PLN 27-DEC- 



LOCUS BAB10489 801 aa 

2000 

DEFINITION gene_id :MDH9 . 18-pir | | S760 82 -similar to unknown protein 
[Arabidopsis 

thaliana] . 
BAB10489 

BAB10489.1 GI:9759484 
locus AB016888 "accession AB016888.1 



ACCESSION 
VERSION 
DBSOURCE 
KEYWORDS 
SOURCE 

ORGANISM 



thale cress . 
Arabidopsis thaliana 
Eukaryota; Viridiplantae ; Streptophyta ; Embryophyta; 
Tracheophyta; 

Spermatophyta; Magnoliophyta ; eudicotyledons ; core eudicots; 
Rosidae; eurosids II; Brassicales; Brassicaceae ; Arabidopsis. 
1 (sites) 

Asamizu,E., Sato,S., Kaneko,T., Nakamura, Y., Kotani,H., 



REFERENCE 
AUTHORS 
Miyaj ima,N 



TITLE 



JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



COMMENT 



see 



and Tabata,S. 

Structural analysis of Arabidopsis thaliana chromosome 5. VIII. 

Sequence features of the regions of 1,081,958 bp covered by 

seventeen physically assigned PI and TAC clones 

DNA Res. 5 (6), 379-391 (1998) 

99156233 

10048488 

2 (residues 1 to 801) 
Nakamura , Y . 
Direct Submission 

Submitted (18 -AUG- 1998) Yasukazu Nakamura, Kazusa DNA Research 
Institute, Department of Plant Gene Research; 1532-3, Yana, 
Kisarazu, Chiba 292-0812, Japan (E-mail :ynakamu@kazusa . or . jp , 
Tel : 81-438-52-3 935, Fax : 81-43 8 -52 -3 934 ) 
Address for correspondence: kaos@kazusa.or.jp 

For the latest information on annotation of this clone, please 



http : //www. kazusa . or. jp/kaos/cgi-bin/agd_graph . cgi?c=MDH9 
Genes with similarity to proteins in the databases are 

described in 

'product' or 'note' qualifiers. Genes that have no significant 
protein similarity are described as 'unknown protein' . 
The software programs used to predict genes include: Grail 

(Informatics Group, Oak Ridge National Laboratory, 
http://compbio.ornl.g0v/Grail-l.3/) , 
GENSCAN (Chris Burge, MIT, http://CCR- 
081.mit.edu/GENSCAN.html) , 

NetGene2 (S.M. Hebsgaard, et al . , CBS, Technical University of 
Denmark, http://www.cbs.dtu.dk/services/NetGene2/) and 
SplicePredictor (Volker Brendel, Stanford University, 
http : //gremlinl . zool . iastate . edu/ cgi -bin/ sp . cgi) . 
Genes encoding tRNAs are predicted by tRNAscan-SE 

(Sean Eddy, Washington University School of Medicine, St. 



Louis , 



http : //genome . wustl . edu/eddy/ tRNAscan-SE/ ) . 



This sequence may not be the entire insert of this clone. It may be
shorter because we remove overlaps between neighboring
submissions.
The 5' clone is K5J14 and the 3' clone is K16E1
FEATURES             Location/Qualifiers
     source          1..801
                     /organism="Arabidopsis thaliana"
                     /strain="Columbia"
                     /db_xref="taxon:3702"
                     /chromosome="5"
                     /clone="MDH9"
                     /clone_lib="Mitsui P1"
     Protein         1..801
                     /name="gene_id:MDH9.18
                     pir||S76082
                     similar to unknown protein"
     CDS             1..801
                     /coded_by="join(AB016888.1:64077..64583,
                     AB016888.1:64666..64890,AB016888.1:64978..65238,
                     AB016888.1:65322..66309,AB016888.1:66599..66732,
                     AB016888.1:66824..67114)"

ORIGIN
        1 mealshvgig lspfqlcrlp pattklrrsh ntstticsas kwadrllsdf nftsdsssss
       61 fatatttatl vspppsidrp erhvpipidf yqvlgaqthf ltdgirrafe arvskppqfg
      121 fsddalisrr qilqaacetl snprsrreyn egllddeeat vitdvpwdkv pgalcvlqeg
      181 geteivlrvg eallkerlpk sfkqdvvlvm alafldvsrd amaldppdfi tgyefveeal
      241 kllqeegass lapdlraqid etleeitpry vlellglplg ddyaakrlng lsgvrnilws
      301 vggggasalv ggltrekfmn eaflrmtaae qvdlfvatps nipaesfevy evalalvaqa
      361 figkkphllq dadkqfqqlq qakvmameip amlydtrnnw eidfglergl calligkvde
      421 crmwlgldse dsqyrnpaiv efvlensnrd dnddlpglck lletwlagvv fprfrdtkdk
      481 kfklgdyydd pmvlsylerv evvqgsplaa aaamarigae hvkasamqal qkvfpsrytd
      541 rnsaepkdvq etvfsvdpvg nnvgrdgepg vfiaeavrps enfetndyai ragvsessvd
      601 ettvemsvad mlkeasvkil aagvaiglis lfsqkyflks sssfqrkdmv ssmesdvati
      661 gsvraddsea lprmdartae nivskwqkik slafgpdhri emlpevldgr mlkiwtdraa
      721 etaqlglvyd ytllklsvds vtvsadgtra lveatleesa clsdlvhpen natdvrtytt
      781 ryevfwsksg wkitegsvla s
//
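
The /coded_by qualifier of the record above lists six intervals on AB016888.1. As a rough consistency check, their summed length should equal the 801 codons of the protein plus one stop codon. The Python sketch below is illustrative only: the helper name `join_length` is hypothetical, and the interval string is transcribed from the qualifier as it reads in this record, so an OCR error in any coordinate would show up as a mismatch.

```python
import re

# Location string transcribed from the /coded_by qualifier of the record above.
CODED_BY = (
    "join(AB016888.1:64077..64583,AB016888.1:64666..64890,"
    "AB016888.1:64978..65238,AB016888.1:65322..66309,"
    "AB016888.1:66599..66732,AB016888.1:66824..67114)"
)


def join_length(location: str) -> int:
    """Sum the lengths of all start..end intervals (1-based, inclusive)."""
    spans = re.findall(r"(\d+)\.\.(\d+)", location)
    return sum(int(end) - int(start) + 1 for start, end in spans)


coding_nt = join_length(CODED_BY)
protein_aa = 801  # length given on the Protein and CDS feature lines

print(coding_nt)                            # total coding nucleotides
print(coding_nt == 3 * (protein_aa + 1))    # codons for 801 residues plus a stop
```

The same arithmetic applies to a single-interval location such as `AY091075.1:114..2519`, since the regular expression only looks for `start..end` pairs.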



>gi|18422214|ref|NM_123613.1| Arabidopsis thaliana putative protein, predicted mRNA

ATGGAAGCTCTGAGTCACGTCGGCATTGGTCTCTCCCCATTCCAATTATGCCGATTACCACCGGCGACGA 
CAAAGCTCCGACGTAGCCACAACACCTCTACAACTATCTGCTCCGCCAGCAAATGGGCCGACCGTCTTCT 
CTCCGACTTCAATTTCACCTCCGATTCCTCCTCCTCCTCCTTCGCCACCGCCACCACCACCGCCACTCTC 
GTCTCTCCGCCACCATCTATTGATCGTCCCGAACGCCACGTCCCCATCCCCATTGATTTCTACCAGGTAT 
TAGGAGCTCAAACACATTTCTTAACCGATGGAATCAGAAGAGCATTCGAAGCTAGGGTTTCGAAACCGCC 
GCAATTCGGTTTCAGCGACGACGCTTTAATCAGCCGGAGACAGATTCTTCAAGCTGCTTGCGAAACTCTG 
TCTAATCCTCGGTCTAGAAGAGAGTACAATGAAGGTCTTCTTGATGATGAAGAAGCTACAGTCATCACTG 
ATGTTCCTTGGGATAAGGTTCCTGGTGCTCTCTGTGTATTGCAAGAAGGTGGTGAGACTGAGATAGTTCT 
TCGGGTTGGTGAGGCTCTGCTTAAGGAGAGGTTGCCTAAGTCGTTTAAGCAAGATGTGGTTTTAGTTATG 
GCGCTTGCGTTTCTCGATGTCTCGAGGGATGCTATGGCATTGGATCCACCTGATTTTATAACTGGTTATG 
AGTTTGTTGAGGAAGCTTTGAAGCTTTTACAGGAGGAAGGAGCAAGTAGCCTTGCACCGGATTTACGTGC 
ACAAATTGATGAGACTTTGGAAGAGATCACTCCGCGTTATGTCTTGGAGCTACTTGGCTTACCGCTTGGT 
GATGATTACGCTGCGAAAAGACTAAATGGTTTAAGCGGTGTGCGGAATATTTTGTGGTCTGTTGGAGGAG 
GTGGAGCATCAGCTCTTGTTGGGGGTTTGACCCGTGAGAAGTTTATGAATGAGGCGTTTTTACGAATGAC 
AGCTGCTGAGCAGGTTGATCTTTTTGTAGCTACCCCAAGCAATATTCCAGCAGAGTCATTTGAAGTTTAC 
GAAGTTGGACTTGCTCTTGTGGCTCAAGCTTTTATTGGTAAGAAGCCACACCTTTTACAGGATGCTGATA 
AGCAATTCCAGCAACTTCAGCAGGCTAAGGTAATGGCTATGGAGATTCCTGCGATGTTGTATGATACACG 
GAATAATTGGGAGATAGACTTCGGTCTAGAAAGGGGACTCTGTGCACTGCTTATAGGCAAAGTTGATGAA 
TGCCGTATGTGGTTGGGCTTAGACAGTGAGGATTCACAATATAGGAATCCAGCTATTGTGGAGTTTGTTT 
TGGAGAATTCAAATCGTGATGACAATGATGATCTCCCTGGACTATGCAAATTGTTGGAAACCTGGTTGGC 
AGGGGTTGTCTTTCCTAGGTTCAGAGACACCAAAGATAAAAAATTTAAACTCGGGGACTACTATGATGAT 
CCTATGGTTTTGAGTTACTTGGAAAGAGTGGAGGTAGTTCAGGGTTCTCCTTTAGCTGCTGCTGCAGCTA 
TGGCAAGGATTGGAGCCGAGCATGTGAAAGCTAGTGCTATGCAGGCACTGCAGAAAGTTTTTCCTTCCCG 
CTATACAGATAGAAACTCGGCTGAACCCAAGGATGTGCAAGAGACAGTGTTTAGTGTAGATCCTGTTGGT 
AACAATGTAGGCCGTGATGGTGAGCCTGGTGTCTTTATTGCAGAAGCTGTAAGACCCTCTGAAAACTTTG 
AAACTAATGATTATGCAATTCGAGCTGGGGTCTCAGAGAGTAGCGTTGATGAAAGTACTGTTGAAATGTC 
CGTTGCTGATATGTTAAAGGAGGCAAGTGTGAAGATCCTAGCTGCTGGTGTGGCAATTGGACTGATTTCA 
CTGTTCAGCCAGAAGTATTTTCTTAAAAGCAGCTCATCTTTTCAACGCAAGGATATGGTTTCTTCTATGG 
AATCTGATGTCGCTACCATAGGGTCAGTCAGAGCTGACGATTCAGAAGCACTTCCCAGAATGGATGCTAG 
GACTGCAGAGAATATAGTATCCAAGTGGCAGAAGATTAAGTCTCTGGCTTTTGGGCCTGATCACCGCATA 
GAAATGTTACCAGAGGTTTTGGATGGGCGAATGCTGAAGATTTGGACTGACAGAGCAGCTGAAACTGCGC 
AGCTTGGGTTGGTTTATGATTATACACTGTTGAAACTATCTGTTGACAGTGTGACAGTCTCAGCAGATGG 
AACCCGTGCTCTGGTGGAAGCAACTCTGGAGGAGTCTGCTTGTCTATCTGATTTGGTTCATCCAGAAAAC 
AATGCTACTGATGTCAGAACCTACACAACAAGATACGAAGTTTTCTGGTCCAAGTCAGGGTGGAAAATCA 
CTGAAGGCTCTGTTCTTGCATCATAA 

>gi|15238978|ref|NP_199063.1| putative protein [Arabidopsis thaliana]
MEALSHVGIGLSPFQLCRLPPATTKLRRSHNTSTTICSASKWADRLLSDFNFTSDSSSSSFATATTTATL
VSPPPSIDRPERHVPIPIDFYQVLGAQTHFLTDGIRRAFEARVSKPPQFGFSDDALISRRQILQAACETL
SNPRSRREYNEGLLDDEEATVITDVPWDKVPGALCVLQEGGETEIVLRVGEALLKERLPKSFKQDVVLVM
ALAFLDVSRDAMALDPPDFITGYEFVEEALKLLQEEGASSLAPDLRAQIDETLEEITPRYVLELLGLPLG
DDYAAKRLNGLSGVRNILWSVGGGGASALVGGLTREKFMNEAFLRMTAAEQVDLFVATPSNIPAESFEVY
EVALALVAQAFIGKKPHLLQDADKQFQQLQQAKVMAMEIPAMLYDTRNNWEIDFGLERGLCALLIGKVDE
CRMWLGLDSEDSQYRNPAIVEFVLENSNRDDNDDLPGLCKLLETWLAGVVFPRFRDTKDKKFKLGDYYDD
PMVLSYLERVEVVQGSPLAAAAAMARIGAEHVKASAMQALQKVFPSRYTDRNSAEPKDVQETVFSVDPVG
NNVGRDGEPGVFIAEAVRPSENFETNDYAIRAGVSESSVDETTVEMSVADMLKEASVKILAAGVAIGLIS
LFSQKYFLKSSSSFQRKDMVSSMESDVATIGSVRADDSEALPRMDARTAENIVSKWQKIKSLAFGPDHRI
EMLPEVLDGRMLKIWTDRAAETAQLGLVYDYTLLKLSVDSVTVSADGTRALVEATLEESACLSDLVHPEN
NATDVRTYTTRYEVFWSKSGWKITEGSVLAS
>gi|20259550|gb|AY091075.1| Arabidopsis thaliana unknown protein (At5g42480) mRNA, complete cds

GATTTAACTTATACTACTCAAAATCAAAATTCCATAAACCCTAGACGACCAAACAGTCTCTTCAATATGT 
AAAACAGAACAAAGTTTTTGTAGTAGCCTAAAAAGACACTCCCATGGAAGCTCTGAGTCACGTCGGCATT 
GGTCTCTCCCCATTCCAATTATGCCGATTACCACCGGCGACGACAAAGCTGCGACGTAGCCACAACACCT 
CTACAACTATCTGCTCCGCCAGCAAATGGGCCGACCGTCTTCTCTCCGACTTCAATTTCACCTCCGATTC 
CTCCTCCTCCTCCTTCGCCACCGCCACCACCACCGCCACTCTCGTCTCTCCGCCACCATCTATTGATCGT 
CCCGAACGCCAGGTCCCCATCCCCATTGATTTCTACCAGGTATTAGGAGCTCAAACACATTTCTTAACCG 
ATGGAATCAGAAGAGCATTCGAAGCTAGGGTTTCGAAACCGCCGCAATTCGGTTTGAGCGACGACGCTTT 
AATCAGCCGGAGACAGATTCTTCAAGCTGCTTGCGAAACTCTGTCTAATCCTCGGTCTAGAAGAGAGTAC 
AATGAAGGTCTTCTTGATGATGAAGAAGCTACAGTCATCACTGATGTTCCTTGGGATAAGGTTCCTGGTG 
CTCTCTGTGTATTGCAAGAAGGTGGTGAGACTGAGATAGTTCTTCGGGTTGGTGAGGCTCTGCTTAAGGA 
GAGGTTGCCTAAGTCGTTTAAGCAAGATGTGGTTTTAGTTATGGCGCTTGCGTTTCTCGATGTCTCGAGG 
GATGCTATGGCATTGGATCCACCTGATTTTATAACTGGTTATGAGTTTGTTGAGGAAGCTTTGAAGCTTT 
TACAGGAGGAAGGAGCAAGTAGCCTTGCACCGGATTTACGTGCACAAATTGATGAGACTTTGGAAGAGAT 
CACTCCGCGTTATGTCTTGGAGCTACTTGGCTTACCGCTTGGTGATGATTACGCTGCGAAAAGACTAAAT 
GGTTTAAGCGGTGTGCGGAATATTTTGTGGTCTGTTGGAGGAGGTGGAGCATCAGCTCTTGTTGGGGGTT 
TGACCCGTGAGAAGTTTATGAATGAGGCGTTTTTACGAATGACAGCTGCTGAGCAGGTTGATCTTTTTGT 
AGCTACGCCAAGCAATATTCCAGCAGAGTCATTTGAAGTTTACGAAGTTGCACTTGCTCTTGTGGCTCAA 
GCTTTTATTGGTAAGAAGCCACACCTTTTACAGGATGCTGATAAGCAATTCCAGCAACTTCAGCAGGCTA
AGGTAATGGCTATGGAGATTCCTGCGATGTTGTATGATACACGGAATAATTGGGAGATAGACTTCGGTCT 
AGAAAGGGGACTCTGTGCACTGCTTATAGGCAAAGTTGATGAATGCCGTATGTGGTTGGGCTTAGACAGT 
GAGGATTCACAATATAGGAATCCAGCTATTGTGGAGTTTGTTTTGGAGAATTCAAATCGTGATGACAATG 
ATGATCTCCCTGGACTATGCAAATTGTTGGAAACCTGGTTGGCAGGGGTTGTCTTTCCTAGGTTCAGAGA 
CACCAAAGATAAAAAATTTAAACTCGGGGACTACTATGATGATCCTATGGTTTTGAGTTACTTGGAAAGA 
GTGGAGGTAGTTCAGGGTTCTCCTTTAGCTGCTGCTGCAGCTATGGCAAGGATTGGAGCCGAGCATGTGA 
AAGGTAGTGCTATGCAGGCACTGCAGAAAGTTTTTCCTTCCCGCTATACAGATAGAAACTCGGCTGAACC 
CAAGGATGTGCAAGAGACAGTGTTTAGTGTAGATCCTGTTGGTAACAATGTAGGCCGTGATGGTGAGCCT 
GGTGTCTTTATTGCAGAAGCTGTAAGACCCTCTGAAAACTTTGAAACTAATGATTATGCAATTCGAGCTG 
GGGTCTCAGAGAGTAGCGTTGATGAAACTACTGTTGAAATGTCCGTTGCTGATATGTTAAAGGAGGCAAG 
TGTGAAGATCCTAGCTGCTGGTGTGGCAATTGGACTGATTTCACTGTTCAGCCAGAAGTATTTTCTTAAA 
AGCAGCTCATCTTTTCAACGCAAGGATATGGTTTCTTCTATGGAATCTGATGTCGCTACCATAGGGTCAG 
TCAGAGCTGACGATTCAGAAGCACTTCCCAGAATGGATGCTAGGACTGCAGAGAATATAGTATCCAAGTG 
GCAGAAGATTAAGTCTCTGGCTTTTGGGCCTGATCACCGCATAGAAATGTTACCAGAGGTTTTGGATGGG 
CGAATGCTGAAGATTTGGACTGACAGAGCAGCTGAAACTGCGCAGCTTGGGTTGGTTTATGATTATACAC 
TGTTGAAACTATCTGTTGACAGTGTGACAGTCTCAGCAGATGGAACCCGTGCTCTGGTGGAAGCAACTCT 
GGAGGAGTCTGCTTGTCTATCTGATTTGGTTCATCCAGAAAACAATGCTACTGATGTCAGAACCTACACA 
ACAAGATACGAAGTTTTCTGGTCCAAGTCAGGGTGGAAAATCACTGAAGGCTCTGTTCTTGCATCATAAT 
ATACTCATATGTAGCATGTCTGAGCTTGCGAGATTCTCTTTGTTTTGTAAATTCTCTCTCTAAGTTAGTG 
TTTATAAATGAACACAAAAAAATTAACGTTCAAAAAAAAAAAAAAAA 
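
The three FASTA entries above use the pipe-delimited NCBI definition line, of the form `>gi|number|database|accession| description`. A minimal Python sketch for splitting such a line into its fields follows. It assumes exactly this gi-style layout; the `Defline` type and `parse_defline` function are hypothetical names introduced only for illustration, and other identifier styles (for example `pir||S76082`) are not handled.

```python
from typing import NamedTuple


class Defline(NamedTuple):
    gi: str
    database: str
    accession: str
    description: str


def parse_defline(line: str) -> Defline:
    """Split a '>gi|<number>|<db>|<accession>| <description>' header."""
    if not line.startswith(">gi|"):
        raise ValueError("not a gi-style FASTA definition line")
    # Drop the leading '>' and split on the first four pipes only, so that
    # any '|' inside the free-text description is left untouched.
    _, gi, database, accession, description = line[1:].split("|", 4)
    return Defline(gi, database, accession, description.strip())


header = (
    ">gi|20259550|gb|AY091075.1| Arabidopsis thaliana unknown protein "
    "(At5g42480) mRNA, complete cds"
)
record = parse_defline(header)
print(record.accession, record.database, record.gi)
```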
LOCUS       AAM13895      801 aa            linear   PLN 21-APR-2002
DEFINITION  unknown protein [Arabidopsis thaliana].
ACCESSION   AAM13895
VERSION     AAM13895.1  GI:20259551
DBSOURCE    accession AY091075.1
KEYWORDS
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta;
            Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons;
            core eudicots; Rosidae; eurosids II; Brassicales; Brassicaceae;
            Arabidopsis.
REFERENCE   1  (residues 1 to 801)
  AUTHORS   Yamada,K., Banh,J., Chan,M.M., Chang,C.H., Chang,E., Dale,J.M.,
            Deng,J.M., Goldsmith,A.D., Lee,J.M., Onodera,C.S., Quach,H.L.,
            Tang,C.C., Toriumi,M., Wu,H.C., Yamamura,Y., Yu,G., Bowser,L.,
            Carninci,P., Chen,H., Cheuk,R., Hayashizaki,Y., Ishida,J.,
            Jones,T., Kamiya,A., Karlin-Neumann,G., Kawai,J., Kim,C., Lam,B.,
            Lin,J., Meyers,M.C., Miranda,M., Narusaka,M., Nguyen,M.,
            Palm,C.J., Sakurai,T., Satou,M., Seki,M., Shinn,P., Southwick,A.,
            Shinozaki,K., Davis,R.W., Ecker,J.R. and Theologis,A.
  TITLE     Arabidopsis Full Length cDNA Clones
  JOURNAL   Unpublished
REFERENCE   2  (residues 1 to 801)
  AUTHORS   Yamada,K., Banh,J., Chan,M.M., Chang,C.H., Chang,E., Dale,J.M.,
            Deng,J.M., Goldsmith,A.D., Lee,J.M., Onodera,C.S., Quach,H.L.,
            Tang,C.C., Toriumi,M., Wu,H.C., Yamamura,Y., Yu,G., Bowser,L.,
            Carninci,P., Chen,H., Cheuk,R., Hayashizaki,Y., Ishida,J.,
            Jones,T., Kamiya,A., Karlin-Neumann,G., Kawai,J., Kim,C., Lam,B.,
            Lin,J., Meyers,M.C., Miranda,M., Narusaka,M., Nguyen,M.,
            Palm,C.J., Sakurai,T., Satou,M., Seki,M., Shinn,P., Southwick,A.,
            Shinozaki,K., Davis,R.W., Ecker,J.R. and Theologis,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-MAR-2002) Plant Gene Expression Center, 800
            Buchanan Street, Albany, CA 94710, USA
COMMENT     RIKEN Genomic Sciences Center (GSC) members carried out the
            collection and clustering of RAFL cDNAs (RAFL cDNA: 'RIKEN
            Arabidopsis Full-Length cDNA'): Seki,M., Narusaka,M., Ishida,J.,
            Satou,M., Kamiya,A., Sakurai,T., Carninci,P., Kawai,J.,
            Hayashizaki,Y. and Shinozaki,K.
            The Salk, Stanford, PGEC (SSP) Consortium members carried out the
            sequencing and annotation of the RAFL cDNAs: Yamada,K., Banh,J.,
            Chan,M.M., Chang,C.H., Chang,E., Dale,J.M., Deng,J.M.,
            Goldsmith,A.D., Lee,J.M., Onodera,C.S., Quach,H.L., Tang,C.C.,
            Toriumi,M., Wu,H.C., Yamamura,Y., Yu,G., Bowser,L., Chen,H.,
            Cheuk,R., Jones,T., Karlin-Neumann,G., Kim,C., Lam,B., Lin,J.,
            Meyers,M.C., Miranda,M., Nguyen,M., Palm,C.J., Shinn,P.,
            Southwick,A., Davis,R.W., Ecker,J.R. and Theologis,A.
            Yamada,K. (SSP/PGEC) and Seki,M. (RIKEN GSC) contributed equally to
            this work. Shinozaki,K. (RIKEN GSC) and Theologis,A. (SSP/PGEC)
            contributed equally to this work as PIs.
            Method: conceptual translation.
FEATURES             Location/Qualifiers
     source          1..801
                     /organism="Arabidopsis thaliana"
                     /db_xref="taxon:3702"
                     /chromosome="5"
                     /clone="RAFL09-76-G11 (R19395)"
                     /note="This clone is in a modified pBluescript vector
                     (FLC-1) as a BamHI/XhoI insert,
                     ecotype: Columbia"
     Protein         1..801
                     /product="unknown protein"
     CDS             1..801
                     /gene="At5g42480"
                     /coded_by="AY091075.1:114..2519"
ORIGIN
        1 mealshvgig lspfqlcrlp pattklrrsh ntstticsas kwadrllsdf nftsdsssss
       61 fatatttatl vspppsidrp erhvpipidf yqvlgaqthf ltdgirrafe arvskppqfg
      121 fsddalisrr qilqaacetl snprsrreyn egllddeeat vitdvpwdkv pgalcvlqeg
      181 geteivlrvg eallkerlpk sfkqdvvlvm alafldvsrd amaldppdfi tgyefveeal
      241 kllqeegass lapdlraqid etleeitpry vlellglplg ddyaakrlng lsgvrnilws
      301 vggggasalv ggltrekfmn eaflrmtaae qvdlfvatps nipaesfevy evalalvaqa
      361 figkkphllq dadkqfqqlq qakvmameip amlydtrnnw eidfglergl calligkvde
      421 crmwlgldse dsqyrnpaiv efvlensnrd dnddlpglck lletwlagvv fprfrdtkdk
      481 kfklgdyydd pmvlsylerv evvqgsplaa aaamarigae hvkasamqal qkvfpsrytd
      541 rnsaepkdvq etvfsvdpvg nnvgrdgepg vfiaeavrps enfetndyai ragvsessvd
      601 ettvemsvad mlkeasvkil aagvaiglis lfsqkyflks sssfqrkdmv ssmesdvati
      661 gsvraddsea lprmdartae nivskwqkik slafgpdhri emlpevldgr mlkiwtdraa
      721 etaqlglvyd ytllklsvds vtvsadgtra lveatleesa clsdlvhpen natdvrtytt
      781 ryevfwsksg wkitegsvla s
//
dbEST Id:       3126415
EST name:       701545606
GenBank Acc:    AI998415
GenBank gi:     5845320

CLONE INFO
Clone Id:       701545606
Source:         Genome Systems, Inc., a wholly owned subsidiary of Incyte
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown

SEQUENCE

ATAAACACTAACTTAGAGAGAGAATTTACAAAACAAAGAGAATCTCGCAAGCTCAGACAT 
GCTACATATGAGTATATTATGATGCAAGAACAGAGCCTTCAGTGATTTTCCACCCTGACT 
TGGACCNGAAAACTTCGTATCTTGTTGTGTAGGTTCTGACATCAGTAGCATTGTTTTCTG 
GATGAACCAAATCAGATAGACAAGCAGACTCCTCCAGAGTTGCTTCCACCAGAGCACGGG 
TTCCATCTGCTGAGACTGTCACACTGTCAACAGATAGTTTCAACAGTGTATAATCATAAA 
CCAACCCAAGCTGCGCAGTTTCAGCTGCTCTGTCAGTCCAAATCTTCAGCATTCGCCCAT 
CCAAAACCTCTGGTAACATTTCTATGCGGTGATCAGGCCCAAAAGCCAGAGACTTAATCT 
TCTGCCACTTGGATACTATATTCTCTGCAGTCCTAGCATCCATTCTGGGAAGTGCTTCTG 
AATCGTCAGCTCTGACTGACCCTATGGTAGCGACATCAGNTTCCATAGAAGAAACCATAT 



NCTTGCGTTGAAAAGATGAGC 



Entry Created: Sep 7 1999 
Last Updated: Sep 8 1999 



LIBRARY
Lib Name:       A. thaliana, Columbia Col-0, rosette-2
Organism:       Arabidopsis thaliana
Cultivar:       Columbia Col-0
Tissue type:    rosette
Develop. stage: 4-7 weeks
Vector:         pSPORT
R. Site 1:      NotI
R. Site 2:      SalI
Description:    cDNA library was derived from untreated rosette tissue from

                deg. C +/- 3 deg. C under constant light, and watered with
                fertilizer. cDNA synthesis was initiated using a
                NotI-oligo(dT) primer. Double-stranded cDNA was blunted,
                ligated to SalI adaptors, digested with NotI, size-selected,
                and cloned into the NotI and SalI sites of the pSPORT
                vector.



SUBMITTER
Name:           David Smoller, Ph.D.
Institution:    Genome Systems, Inc., a wholly owned subsidiary of Incyte
                Pharmaceuticals, Inc.
Address:        4633 World Parkway Circle, St. Louis, MO 63134, USA
Tel:            877-577-2733
Fax:            314-427-3324
E-mail:         service@genomesystems.com

CITATIONS
Title:          Arabidopsis thaliana Gene Expression MicroArray
Authors:        Chen,J., Momiyama,M., Chan,E., Mooney,M., Carroon,B.,
                Gilliland,D., Wang,X., Hillman,J., Guegler,K., Kim,C., Doyle
                ,M., Brzoska,P., Gorgone,G., Burns,D., Griffin,J.,
                Mouanoutoua,M., Nguyen,D., Tan,R., Rose,M., Warren,B., Ton
                ,B., Kastury,K., Borillo,C., Turner,C., Carpio,T., Policky,J.,
                Suzuki,G., Argentine,C., Shah,S., Nobriga,A., Murry,L.,
                Krikorian,S., Elder,L., Hanson,D.
Year:           1999
Status:         Unpublished
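
The EST above appears to have been read from the opposite strand: the reverse complement of its first 60 bases matches a stretch just downstream of the stop codon in the AY091075.1 mRNA listed earlier. The Python sketch below checks this for those 60 bases only. Both strings are transcribed from this document, the function name is illustrative, and the comparison says nothing about the rest of the read, which contains ambiguous `N` calls.

```python
# First 60 bases of the AI998415 SEQUENCE block, in groups of ten.
EST_5PRIME = (
    "ATAAACACTA" "ACTTAGAGAG" "AGAATTTACA"
    "AAACAAAGAG" "AATCTCGCAA" "GCTCAGACAT"
)

# 60 bases from the 3' untranslated region of the AY091075.1 mRNA entry.
MRNA_3UTR = (
    "ATGTCTGAGC" "TTGCGAGATT" "CTCTTTGTTT"
    "TGTAAATTCT" "CTCTCTAAGT" "TAGTGTTTAT"
)

# Translation table for the four bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


print(reverse_complement(EST_5PRIME) == MRNA_3UTR)
```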
dbEST Id:       5659606
EST name:       MtBC10F12F1
GenBank Acc:    AL382914
GenBank gi:     9682665

CLONE INFO
Clone Id:       MtBC10F12 (T3)
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown

SEQUENCE



CTGGTGTAGCAATTGGACTCATAACTTTAGCTGGTTTGAAGATTTTACCTTCTAAAAATG 

GCTCGCCCGTTCTTCACAAAGTGACTGGTTCAGCAATTGCGTCAGATACTATCAATTTAG 

GTCCTGTAGGAGATGAAGAATTAGGAGAGCAACTACCAAAAATGAGTGCAATGGTTGCAG 

AAGCTCTAGTCCGCAAGTGGCAATATATCACATCCCAAGCTTTTGGACCTGACCATTGCC 

TAGGAAGATTGCAAGAGGTGTTGGACGGCCAAATGTTGAAGATATGGACTGATCG 



Entry Created:  Aug 3 2000
Last Updated:   Aug 3 2000

COMMENTS
                Contact: Pascal Gamas and Etienne-Pascal Journet,
                Laboratoire de Biologie Moleculaire des Relations
                Plantes-Microorganismes, CNRS-INRA, BP 27 31326
                Castanet-Tolosan Cedex, France (Email:
                Mt-est@toulouse.inra.fr Website:
                http://sequence.toulouse.inra.fr/Mtruncatula.html)



LIBRARY
Lib Name:       MtBC
Organism:       Medicago truncatula
Cultivar:       Jemalong
Tissue type:    arbuscular mycorrhiza
Develop. stage: harvested 3 weeks post inoculation with Glomus intraradices
Vector:         pBluescript pSK
R. Site 1:      EcoRI
R. Site 2:      XhoI
Description:    M. truncatula sterilised seeds were germinated for 72h at 25
                C, before transplanting into a 1/3 Epoisses soil : 2/3
                calcined Terragreen mix in the presence of onion root
                fragments colonized by the arbuscular mycorrhizal fungus
                Glomus intraradices (Schenck & Smith, isolate LPA8). The
                plants were watered every day and twice a week with a
                modified nutrient Long Ashton solution without phosphate but
                with a high level of nitrate. After 3 weeks RNA was
                extracted from whole root systems. cDNA was prepared from
                polyA+ enriched RNA. The cDNA was directionally ligated into
                Uni-zap XR vector from Stratagene and packaged using
                Gigapack Gold packaging extracts. Plasmids containing cDNA
                inserts were mass-excised from phage stocks using ExAssit
                helper phage and propagated in SOLR cells. Clone ordering
                and sequencing was performed by the Centre National de
                Sequencage (Genoscope, Evry, France). Note: EST may be of
                fungal origin.

SUBMITTER
Name:           Genoscope
Institution:    Genoscope - Centre National de Sequencage
Address:        BP 191 91006 EVRY cedex - France
E-mail:         seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr

CITATIONS
Title:          Medicago truncatula ESTs from endomycorrhizal roots
Authors:        Journet,E.P., Crespeau,H., van-Tuinen,D., Gouzy,J., Jaillon
                ,O., Niebel,A., Carreau,V., Chatagnier,O., Kahn,D.,
                Gianinazzi-Pearson,V., Gamas,P.
Year:           2000
Status:         Unpublished
dbEST Id:       5659607
EST name:       MtBC10F12R1
GenBank Acc:    AL382915
GenBank gi:     9682666

CLONE INFO
Clone Id:       MtBC10F12 (T7)
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown

SEQUENCE



CCCAAGCTTTTGGACCTGACCATTGCCTAGGAAGATTGCAAGAGGTGTTGGACGGCGAAA 
TGTTGAAGATATGGACTGATCGAGCAGCTGAGATTGCAGAGCTTGGTTGGTCATATGACT 



ACAACTTGGAGGATCTCAACATCGACAGTGTGACCATATCACAGAATGGGCGGCGTGCAG 
TAGTGGAAACAACTCTCAAAGAGTCTACCCACCTCACTGCTGTTGGTCATCCACAGCATG 
CTACTTCCAACAGCAGAACCTACACAACAAGATATGAAATGTCTTTTTCAGATTCAGGGT 



GGAAAATTATTGAAGGAGCTGTCCTTGAGTCGTAATTAGGTTTTGTAATATGTAATATAT 

GTCAGGTTAGTACACTTCAATATTAACCCCCTCGAGCCTATGCCCACTGTCTTGTATGTA 

CCTGTTGTTTTGTGCATTTTTCAAGCATTTATGTAGTCAGGCTGTAAATACTTGGAGGGT 

ATTTGATCAAATAATTATCCGGTTAAAAAAAAAAAAAAAAAAAAAAA 

Entry Created:  Aug 3 2000
Last Updated:   Aug 3 2000

COMMENTS
                Contact: Pascal Gamas and Etienne-Pascal Journet,
                Laboratoire de Biologie Moleculaire des Relations
                Plantes-Microorganismes, CNRS-INRA, BP 27 31326
                Castanet-Tolosan Cedex, France (Email:
                Mt-est@toulouse.inra.fr Website:
                http://sequence.toulouse.inra.fr/Mtruncatula.html).



LIBRARY
Lib Name:       MtBC
Organism:       Medicago truncatula
Cultivar:       Jemalong
Tissue type:    arbuscular mycorrhiza
Develop. stage: harvested 3 weeks post inoculation with Glomus intraradices
Vector:         pBluescript pSK
R. Site 1:      EcoRI
R. Site 2:      XhoI
Description:    M. truncatula sterilised seeds were germinated for 72h at 25
                C, before transplanting into a 1/3 Epoisses soil : 2/3
                calcined Terragreen mix in the presence of onion root
                fragments colonized by the arbuscular mycorrhizal fungus
                Glomus intraradices (Schenck & Smith, isolate LPA8). The
                plants were watered every day and twice a week with a
                modified nutrient Long Ashton solution without phosphate but
                with a high level of nitrate. After 3 weeks RNA was
                extracted from whole root systems. cDNA was prepared from
                polyA+ enriched RNA. The cDNA was directionally ligated into
                Uni-zap XR vector from Stratagene and packaged using
                Gigapack Gold packaging extracts. Plasmids containing cDNA
                inserts were mass-excised from phage stocks using ExAssit
                helper phage and propagated in SOLR cells. Clone ordering
                and sequencing was performed by the Centre National de
                Sequencage (Genoscope, Evry, France). Note: EST may be of
                fungal origin.

SUBMITTER
Name:           Genoscope
Institution:    Genoscope - Centre National de Sequencage
Address:        BP 191 91006 EVRY cedex - France
E-mail:         seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr

CITATIONS
Title:          Medicago truncatula ESTs from endomycorrhizal roots
Authors:        Journet,E.P., Crespeau,H., van-Tuinen,D., Gouzy,J., Jaillon
                ,O., Niebel,A., Carreau,V., Chatagnier,O., Kahn,D.,
                Gianinazzi-Pearson,V., Gamas,P.
Year:           2000
Status:         Unpublished
dbEST Id:       9071332
EST name:       NF119C11IN1F1086
GenBank Acc:    BI268376
GenBank gi:     14874230

CLONE INFO
Clone Id:       NF119C11IN (5')
Insert length:  660
Plate:          119  Row: C  Column: 11
DNA type:       cDNA

PRIMERS
Sequencing:     TCACACAGGAAACAGCTATGAC
PolyA Tail:     Unknown

SEQUENCE



CACGCTTCTCCAAAAAACCTAACCGTCTCCATTCCTCCGCCGTCTCCGCCACCAGTAAAT 
GGGCGGAGCGACTCATTTCCGATTTCCAATTCCTCGGCGACACCTCCTCTTCCTCCTCCA 
CCACCACCTCCGCCACAGTCACTCTCACTCCTTCTTACCCTCCTCCGATAGAACGCCACG 
TGTCACTCCCTCTCGACCTGTACAAAATCCTCGGCGCCGAAACGCATTTTCTCGGTGATG 
GTATTCGGAGAGCTTATGAAGCGAAATTCTCGAAGCCTCCTCAGTATGCTTTCAGTAATG 
AAGCTTTGATTAGTCGTCGTCAGATTCTTCAAGCTGCTTGTGAAACCCTAGCTGATCCTG 
CTTCTAGAAGAGAGTATAATCAAAGCCTCGTCGACGATGAAGACGAAGATGAGGAATCTT 
CCATTCTCACTGAAATCCCTTTCGACAAAGTTCCTGGAGCTCTGTGCGTGTTGCAAGAAG 
CTGGAGAGACGGAGTTGGTGCTTCGGATTGGAGGGGGTTTACTGAGAGAGAGGTTACCGA 
AGATGTTTAAGCAAGATGTTGTGTTGGCTATGGCGCTTGCATATGTTGACGTTTCTAGGG 



ATGCTATGGCTTTGTCCCCGCCAGATTTCATTGTTGCTTGTGAGATGCTGGAAAGGGCAT 

Entry Created: Jul 18 2001
Last Updated: Jul 18 2001 

LIBRARY 

Lib Name: Insect herbivory 

Organism: Medicago truncatula 

Tissue type: local and systemic leaves 
Develop. stage: mature
Vector: Lambda Zap 

Description: Library was produced from fully expanded M. truncatula 
leaves of plants fed upon by Spodoptera exigua (beet 
armyworm) for 24 hours. Systemic (undamaged leaves from 
injured plants) and wounded leaves were harvested and 
pooled. 



SUBMITTER
Name:           Korth K
Lab:            Dept. of Plant Pathology
Institution:    University of Arkansas
Address:        217 Plant Science Building, Fayetteville, AR 72701, USA
Tel:            501 575 5191
Fax:            501 575 7601
E-mail:         kkorth@comp.uark.edu

CITATIONS
Title:          Expressed Sequence Tags from the Samuel Roberts Noble
                Foundation Medicago truncatula insect herbivory library
Authors:        Korth,K., Scott,A.D., Harris,A.R., Gonzales,R.A., Bell,C.J.,
                Flores,H.R., Inman,J.T., Weller,J.W., May,G.D.
Year:           2000
Status:         Unpublished
dbEST Id:       3883556
EST name:       si29e11.y1
GenBank Acc:    AW472683
GenBank gi:     7042789

CLONE INFO
Clone Id:       GENOME SYSTEMS CLONE ID: Gm-r1030-357 (5')
Source:         ResGen, Invitrogen Corp.
Insert length:  609
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown

SEQUENCE



AGCGTTGTGTGTGTTGCAGGAAGCTGGAGAGACGGAGCTTGTGCTTGAGATTGGGCAGGG 

TTTGCTTAGGGAGAGGTTGCCGAAGACGTTTAAGCAGGATGTTGTGTTGGCTATGGCACT 

CGCATTTGTTGACGTGTCAAGGGATGCTTGGCTTGTTCACCGGATTTCATTGCGGCTGTG 
AGATGCT 



Entry Created:  Feb 23 2000
Last Updated:   Dec 3 2001

COMMENTS
                This clone is available through: ResGen, Invitrogen Corp.
                2130 South Memorial Parkway Huntsville, AL 35801 For further
                information call: (800)-533-4363 or contact via email:
                ccu@resgen.com



LIBRARY
Lib Name:       Gm-r1030
Organism:       Glycine max
Lab host:       DH10B
Vector:         pSPORT1
R. Site 1:      SalI
R. Site 2:      NotI
Description:    This cDNA library was constructed from mRNA isolated from
                immature cotyledons of greenhouse grown plants (individual
                seed fresh weight of 100-300mg). The library was prepared
                using the Life Technologies pSuperScript cDNA library
                construction kit. Complementary DNA was synthesized from
                mRNA using a poly(dT) sequence with a NotI restriction site.
                SalI linkers adapters were ligated to the blunt-ended cDNA
                fragments followed by NotI digestion. The cDNA fragments
                were directionally cloned into the NotI-SalI restriction
                site of the pSPORT1 vector. The ligated cDNA fragments were
                transformed into E. coli ElectroMax DH10B host cells. This
                library was constructed by Dr. Lila Vodkin and Dr. Anu
                Khanna. Note that Gm-r1030 is a re-rack of Gm-c1007.
SUBMITTER
Name:           Shoemaker R/Public Soybean EST Project
Lab:            Public Soybean EST Project
Institution:    Washington University School of Medicine
Address:        4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108,
                USA
Tel:            314 286 1800
Fax:            314 286 1810
E-mail:         est@watson.wustl.edu

CITATIONS
Title:          Public Soybean EST Project
Authors:        Shoemaker,R., Keim,P., Vodkin,L., Erpelding,J., Coryell,V.,
                Khanna,A., Bolla,B., Marra,M., Hillier,L., Kucaba,T., Martin
                ,J., Beck,C., Wylie,T., Underwood,K., Steptoe,M., Theising
                ,B., Allen,M., Bowers,Y., Person,B., Swaller,T., Gibbons,M.,
                Pape,D., Harvey,N., Schurk,R., Ritter,E., Kohn,S., Shin,T.,
                Jackson,Y., Cardenas,M., McCann,R., Waterston,R., Wilson,R.
Year:           1999
Status:         Unpublished
dbEST Id:       5570813
EST name:       EST416888
GenBank Acc:    BE472035
GenBank gi:     9562526

CLONE INFO
Clone Id:       CSTA31L21
Source:         Cornell University
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown

SEQUENCE

GGAAAGCTTCCTTAACAATGGAGGCATTAACACAGCTAAGCTTTGGCATTTGTACTCCAC 

GCCTTTCATCACCATTTCAACTAGCCGCCGCCGGTGGTAAGAAGCCGCCGAGACTCAATG 

CCGTTAACGGAGGAGCTAGTAGTGTTACCGGTGGAACAAGTAGTTTACCTACTAACTTCT 

CCGCTAGTAAATGGGCGGATCGTCTTCTCGCCGATTTCCAATTCCTTCCTTCCACCACCA 

CCTCCGACTCATCGGATTTCCAGAATTCAACTTCTACAACCTCCGTTACGACTATTCCTC 

CTCCTGTTGCTCCTTCAGACCACCACATTTCAATGCCTATAGACTTTTATAGAGTGCTTG

GTGCTGAAGCTCACTTCCTCGGTGACGGTATTAGGAGATGCTACGATGCTAGAATTACAA 

AGCCTCCGCAGTACGGATACAGTCAGGAAGCATTGATTGGCCGACGGCAGATTCTTCAAG 

CTGCTTGTGAAACCCTTGCTGACTCTACCTCTCGTAGAGAGTACAATCAAGGCCTCGCTC 

AGCATGAGTTCGATACTATTCTAACTCCTGTCCCCTGGGATAAAGTTCCGGGAGCAATGT 
GTGTTTTG 

Entry Created: Jul 28 2000 
Last Updated: Jul 28 2000

COMMENTS 



5 prime sequence 



LIBRARY
Lib Name:       potato stolon, Cornell University
Organism:       Solanum tuberosum
Cultivar:       Bintje
Tissue type:    axillary buds of stem explants, swelling stolons
Develop. stage: 1 to 3 days
Lab host:       SOLR
Vector:         pBlueScript SK(-)
R. Site 1:      EcoRI
R. Site 2:      XhoI
Description:    RNA was supplied by Christian Bachem & Beatrix
                Horvath (Laboratory of Plant Breeding, Dept. of Plant
                Sciences, Wageningen University, The Netherlands). Total RNA
                was isolated from developing axillary buds of potato nodal
                stem cuttings cultured on medium for the introduction of
                tuber formation as described in Bachem et al. (Plant Journal
                1996). Tissue samples were taken of stages corresponding to
                growing stolons and the early stages of tuber formation.

SUBMITTER
Name:           Research Genetics, Libraries Division
Tel:            1-800-711-6195
E-mail:         cdna@resgen.com

CITATIONS
Title:          Generation of ESTs from potato swelling stolons
Authors:        van der Hoeven,R., Bezzerides,J., Bachem,C., Horvath,B.,
                Visser,R., Holt,I.E., Liang,F., Hansen,T.S., Utterback,T.,
                Bowman,C.L., Doan,B., Bougri,O., Buell,C.R., Ronning,C.M.,
                Tanksley,S.D., Baker,B.
Year:           1999
Status:         Unpublished
dbEST Id:       8892494
EST name:       F013P64Y
GenBank Acc:    BI120337
GenBank gi:     18004312

CLONE INFO
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown

SEQUENCE

GAAGATTTCATGAATGAGGCCTTCTTACGTATGACAGCAGCTGAGCAGGTTGATCTGTTC 



GTCACCACGCCAAGTAATATCCCGGCTCAAAATTTTGAAGTTTATGGAGTGGCACTTGCC 

CTTGTTGCCCAAGCTTTCATTGGTAAAAAGCCTCATCTCATCACAGATGCTGATAACCTA 

TTCGGACAGCTTCAGCAGATTAAGGTAACAAATCAAGGGAGTCTTGTTCCTGTCTTTGGT 

TCCATGGAAAACCGTGATATTGACTTTGGGTTGGAGAGGGGCTTTGTTCACTGCTTGTAG 
GCCAGCT 



Entry Created:  Dec 31 2001
Last Updated:   Dec 31 2001

LIBRARY
Lib Name:       Populus flower cDNA library
Organism:       Populus balsamifera subsp. trichocarpa
Organ:          flower

SUBMITTER
Name:           Erlandsson R
Lab:            Department of Biotechnology
Institution:    Royal Institute of Technology
Address:        Teknikringen 30, Stockholm S-10044, Sweden
Tel:            46 8 790 8287
Fax:            46 8 245452
E-mail:         rikerl@biochem.kth.se

CITATIONS
Title:          Gene expression in Populus
Authors:        Hertzberg,M., Aspeborg,H., Erlandsson,R., Bjorkbacka,H.,
                Hiltonen,T., Karlsson,J., Teeri,T., Gustafsson,P., Bahlerao
                ,R., Jansson,S., Nilsson,O., Sundberg,B., Nilsson,P., Uhlen
                ,M., Sandberg,G., Lundeberg,J.
Year:           2001
Status:         Unpublished
dbEST Id:       1782844
EST name:       L30-504T3
GenBank Acc:    AI043508
GenBank gi:     3290291

CLONE INFO
Clone Id:       L30-504 (5')
Id as DNA:      L30-6A504
Id in host:     L30-6A504
Insert length:  434
Plate:          L30-6  Row: A  Column: 12
DNA type:       cDNA

PRIMERS
PCR forward:    T7
PCR backward:   T3
Sequencing:     T3
PolyA Tail:     Unknown

SEQUENCE

GGGAAACGTGCCTTGGTGGAAGCAACTCTTCAAGAATCAGCGCAGTTAACTGACGTTAAC 
CAACCTGAGCATAACGATTCTTACAGCAGAACATACACAACAAGGTACGAGATGTTTCAC 
TCCAATGCTGGGTGGAAGATCATAGAGGGAGCTGTCCTCCAATCTTAAGCTGCTGGAAAT 
CCAGTCTTGAATGTACATATTTTCACATCATCTGCACATTATGAATGAAGGATGGTATGT 
GTTTTCTGGACAGTGGTATTTGATCATGTTGTGTTTATTTTGGTAACAAGTTTTGATCAT 



TATCAAAAAGATCACTCTTGTAAGTTAGTTTTTTCCACAATAAATCAACTATTTATATGA 

AAGTTTTTATATCAGGACTACTTGCCTTTACTTATATAAACTTTGAGAAATTTTTT 
Quality:        High quality sequence stops at base: 350

Entry Created:  Jul 6 1998
Last Updated:   Feb 20 2001

COMMENTS
                Poly(A) tail, 18 nt: 417..434

LIBRARY
Lib Name:       Ice plant Lambda Uni-Zap XR expression library, 30 hours
                NaCl treatment
Organism:       Mesembryanthemum crystallinum
Tissue type:    Leaf, 30 h 0.4M NaCl
Develop. stage: Six week old
Vector:         Lambda Uni-Zap XR, Bluescript SK-
R. Site 1:      EcoRI
R. Site 2:      XhoI

SUBMITTER
Name:           Cushman JC
Lab:            Department of Biochemistry
Institution:    University of Nevada
Address:        MS200, Reno, NV 89557-0014, USA
Tel:            775-784-1918
Fax:            775-784-1650
E-mail:         jcushman@unr.edu

CITATIONS
Title:          An expressed sequence tag database for the common ice plant,
                Mesembryanthemum crystallinum
Authors:        Cushman,J.C.
Year:           1997
Status:         Unpublished
dbEST Id:       4982897
EST name:       AU095068
GenBank Acc:    AU095068
GenBank gi:     8857750

CLONE INFO
Clone Id:       E51113
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown

SEQUENCE



TGGTGCTTCTCATTTGGGCTGCTGCTGCTGCTATTGCAAAACTTGGTGCTCAAGCTACAG 
CTGCACTTGGTACTGTGAAATCAAATGCTATTCAAGCGTTCAACAAGGTTTTNCCATTGA 
TAGAACAGTTAGACAGGTCAGCCATGGAAAATACTAAAGATGGCCCTGGGGGATATCTTG 
AAAATTTTGACCAGGAAAATGCACCTGCTCATGATTCGAGAAATGCCGCCTTGAAGATTA 



TCTCTCTGGCGCACTGTTTGCACTGTTGGCAGTAATTGGGGCCAAATATTTGCCTCGTAA 



GAGGCCCCTTTCTGCTATTAGGAGTGAGCATGGATCTGTGGCAGTTGCTAATAGTGTCGA 



CTCTACTGATGATCCTGCACTAGATGAAGATCCAGTACATATTCCTAGAATGGATGCGAA 
GCTGGCAGAAGATATTGTTCGCAAGTGGCAGAGTATCAAATCTAA 

Entry Created: Jun 30 2000
Last Updated: Apr 3 2002

COMMENTS

PROJECT='RGP'



LIBRARY 

Lib Name:       Rice immature leaf including apical meristem (under long day
                condition)
Organism:       Oryza sativa (japonica cultivar-group)
Cultivar:       Nipponbare
Develop. stage: immature leaf including apical meristem (under long day
                condition)



SUBMITTER
Name:           Takuji Sasaki
Institution:    National Institute of Agrobiological Resources
Address:        Rice Genome Research Program, Kannondai 2-1-2, Tsukuba,
                Ibaraki 305-8602, Japan
Tel:            81-298-38-7441
Fax:            81-298-38-7468
E-mail:         tsasaki@abr.affrc.go.jp, URL: http://rgp.dna.affrc.go.jp/

CITATIONS
Title:          Rice cDNA from immature leaf including apical meristem
                (2000)
Authors:        Sasaki,T., Yamamoto,K.
Year:           2000
Status:         Unpublished
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dbEST Id: 
8592489
EST name: AU183658
GenBank Acc: AU183658
GenBank gi: 14189015

CLONE INFO
Clone Id: E51136
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown

SEQUENCE

ATCATAAGAAGCGCCAAGAAGGGCTTCAAGGTGCGAGAAACATTTTGTGGAGCGTTGGCA 
GAGGAGGTATTGCTACCGTTGGAGGAGGATTTTCTCGTGAAGCCTTCATGAACGAGGCTT 
TTTTGAGGATGACATCAATTGAACAGATGGATTTCTTTTCAAAAACACCGAATAGCATTC 
CTCCTGAATGGTTTGAAATTTACAATGTAGCACTTGCACATGTCGCTCAAGCAATTATAA 



GTAAAAGGCCACAATTCATCATGATGGCGGATGATCTTTTTGAACAACTCCAGAAGTTCC 
ACATAGGTC 



Entry Created: May 22 2001
Last Updated: Apr 3 2002



COMMENTS

PROJECT='RGP'

LIBRARY
Lib Name: Rice immature leaf including apical meristem (under long
day condition)
Organism: Oryza sativa (japonica cultivar-group)
Cultivar: Nipponbare
Develop. stage: immature leaf including apical meristem (under long day
condition)



SUBMITTER 
Name: Takuji Sasaki
Institution: National Institute of Agrobiological Resources
Address: Rice Genome Research Program, Kannondai 2-1-2, Tsukuba,
Ibaraki 305-8602, Japan
Tel: 81-298-38-7441
Fax: 81-298-38-7468
E-mail: tsasaki@abr.affrc.go.jp, URL: http://rgp.dna.affrc.go.jp/

CITATIONS
Title: Rice cDNA from immature leaf including apical meristem
(2001)
Authors: Sasaki,T., Yamamoto,K.
Year: 2001
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Status : 



Unpublished
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dbEST Id: 
2462373
EST name: AU058418
GenBank Acc: AU058418
GenBank gi: 4714451

CLONE INFO
Clone Id: E51113_1A
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown

SEQUENCE

ATCATAAGAAGCGCCAAGAAGGGCTTCAAGGTGCGAGAAACATTTTGTGGAGCGTTGGCA 

GAGGAGGTATTGCTACCGTTGGAGGAGGATTTTCTCGTGAAGCCTTCATGAACGAGGCTT 

TTTTGAGGATGACATCAATTGAACAGATGGATTTCTTTTCAAAAACACCGAATAGCATTC 

CTCCTGAATGGTTTGAAATTTACAATGTAGCACTTGCACATGTCGCTCAAGCAATTATAA 

GTAAAAGGCCACAATTCATCATGATGGCGGATGATCTTTTTGAACAACTCCAGAAGTTCA 
ACATAGGTTCTCATTATGCTTATGATAATGAGATGG 



Entry Created: Apr 29 1999
Last Updated: Apr 1 2002



COMMENTS 



PROJECT ='RGP' 



LIBRARY 
Lib Name: Oryza sativa Nipponbare immature leaf including apical
meristem (under long day condition)
Organism: Oryza sativa (japonica cultivar-group)
Cultivar: Nipponbare
Develop. stage: immature leaf including apical meristem (under long day
condition)



SUBMITTER 

Name: Takuji Sasaki
Institution: National Institute of Agrobiological Resources
Address: Rice Genome Research Program, Kannondai 2-1-2, Tsukuba,
Ibaraki 305-8602, Japan
Tel: 81-298-38-7441
Fax: 81-298-38-7468
E-mail: tsasaki@abr.affrc.go.jp, URL: http://rgp.dna.affrc.go.jp/



CITATIONS 
Title: Rice cDNA from immature leaf including apical meristem
Authors: Sasaki,T., Yamamoto,K.
Year: 1997
Status: Unpublished
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dbEST Id: 
5613895
EST name: WHE0365_C09_F17ZS
GenBank Acc: BE490117
GenBank gi: 9609650

CLONE INFO
Clone Id: WHE0365_C09_F17
DNA type: cDNA

PRIMERS
Sequencing: Stratagene SK primer
PolyA Tail: Unknown

SEQUENCE

CAGTGCTTGCAATTGGAGGGCACTTACTGGAGGACCGCCCGCCCAAGCGGTTCAAGCAGG 

ATGTGGTGCTGGCAATGGCGCTCGCTTATGTGGATCTATCAAGGGACGCAATGGCGGCTA 

GCCCTCCAGATGTAATCCGCTGCTGTGAGGTGCTTGAAAGGGCTCTCAAGCTTTTGCAGG 

AGGATGGGGCAATCAATCTCGCACCTGGTTTGCTCTCACAAATTGATGAAACTCTGGAGG 

ATATCACACCTCGTTGTGTTTTGGAGCTTCTTGCCCTTCCTCTTGATGAAAAACATCAGA 

ATGAACACCAAGAAGGTCTTCGTGGTGTGAGAAACATTTTGTGGAGTGTTGGCAGAGGAG 

GTATTGGTACTGTTGGAGGAGGATTTTCGCGTGAAGCCTACATGAATGAAGCCTTCCTGC 

AGATGACATCGGCGGAGCAGATGGATTTCTTCTCAAAAACACCGAATAGCATACCGCCTG 

AATGGTTTGAAATCTATAGCGTGGCACTTGCAAATGTTGCTCAAGCAATTGTAAGTA 



Entry Created: Jul 31 2000
Last Updated: Jul 31 2000

COMMENTS
Sequences have been trimmed to remove vector sequence and low
quality sequence with phred score less than 20

LIBRARY
Lib Name: Wheat cold-stressed seedling cDNA library
Organism: Triticum aestivum
Cultivar: Chinese Spring
Tissue type: Seedling
Develop. stage: Five-day old seedling
Lab host: E. coli SOLR
Vector: Lambda Uni-ZAP XR, excised phagemid
R. Site 1: EcoRI
R. Site 2: XhoI
Description: Seeds were surface-sterilized, germinated and grown
aseptically in the dark at room temperature on filter paper
with water, nystatin and cefotaxime in covered
crystallization dishes. Five-day old seedlings were
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tissue , 
transferred to 5 C cold room and kept for 48 hr. The
total RNA, and poly(A) RNA were prepared, a cDNA library was
made, and the cDNA clones were in vivo excised to give
pBluescript phagemids in the TJ Close lab (Choi, Close,
Fenton) at the University of California, Riverside. Plasmid
DNA preparations and DNA sequencing were performed in the
OD Anderson lab (all other authors).

SUBMITTER
Name: Olin Anderson
Institution: US Department of Agriculture, Agriculture Research Service,
Pacific West Area, Western Regional Research Center
Address: 800 Buchanan Street, Albany, CA 94710, USA
Tel: 5105595773
Fax: 5105595818
E-mail: oandersn@pw.usda.gov

CITATIONS
Title: The structure and function of the expressed portion of the
wheat genomes - Cold-stressed seedling cDNA library
Authors: Anderson,O.D., Chao,S., Choi,D.W., Close,T.J., Fenton,R.D.,
Han,P.S., Hsia,C.C., Kang,Y., Lazo,G.R., Miller,R., Rausch,C.J.,
Seaton,C.L., Tong,J.C.
Year: 2000
Status: Unpublished




FIG. 8 continued 32/110 



dbEST Id: 8348091
EST name: WHE2493_E05_J09ZS
GenBank Acc: BG607272
GenBank gi: 13657255

CLONE INFO
Clone Id: WHE2493_E05_J09
DNA type: cDNA

PRIMERS
Sequencing: Stratagene SK primer
PolyA Tail: Unknown

SEQUENCE

ACACCTCGTTGTGTTTTGGAGCTTCTTGCCCTTCCTCTTGATGAAAAGCACCAGAGTAAA 

CGCCAAGAAGGTCTTCGTGGTGTGAGAAACATTTTGTGGAGTGTTGGTAGAGGAGGTATT 

GCTACTGTTGGAGGAGGATTTTCNCGTGAAGCCTACATGAATGAGGCCTTTTTGCAGATG 

ACATCAGCGGAGCAGATGGATTTCTTTTCAAAAACGCCAAATAGCATACCACCTGAATGG 

TTTGAAATCTATAGTGTGGCACTCGCAAATGTTGCTCAAGCAATTGTAAGTAAAAGGCCA 

NAGCTCATCATGGTGGCAGATGATCTTTTCGAACAGCTCCAGAAGTTCAATATAGGTTCT

CAATATGCTTATGATAATGAATTGGATCTTGTGTTGGAAAGGGCACTTTGCTCATTGC 



Entry Created: Apr 17 2001
Last Updated: Apr 17 2001

COMMENTS
Sequences have been trimmed to remove vector sequence and low
quality sequence with phred score less than 20

LIBRARY
Lib Name: Triticum monococcum early reproductive apex cDNA library
Organism: Triticum monococcum
Cultivar: DV92
Tissue type: Early reproductive apex
Develop. stage: Seven week-old plants
Lab host: E. coli XLOLR
Vector: Lambda Uni-ZAP XR, excised phagemid
R. Site 1: EcoRI
R. Site 2: XhoI
Description: The tissue, total RNA, and poly(A) RNA were prepared from
apex at double-ridge stage to terminal-spikelet stage during
transition from vegetative state to flower state, a cDNA
library was made, and the cDNA clones were in vivo excised
at the University of California, Davis (V. Echenique, B.
Stamova, J. Dubcovsky). Plasmid DNA preparations and DNA
sequencing were performed in the OD Anderson lab (all other
authors).



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SUBMITTER 
Name: Olin Anderson
Institution: US Department of Agriculture, Agriculture Research Service,
Pacific West Area, Western Regional Research Center
Address: 800 Buchanan Street, Albany, CA 94710, USA
Tel: 5105595773
Fax: 5105595818
E-mail: oandersn@pw.usda.gov

CITATIONS
Title: The structure and function of the expressed portion of the
wheat genomes - Early reproductive apex cDNA library from
Triticum monococcum
Authors: Anderson,O.D., Chao,S., Dubcovsky,J., Echenique,V.,
Han,P.S., Hsia,C.C., Kang,Y., Lazo,G.R., Miller,R., Rausch,C.J.,
Seaton,C.L., Stamova,B., Tong,J.C.
Year: 2001
Status: Unpublished
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dbEST Id: 
9919900
EST name: HVSMEl0017D16f
GenBank Acc: BI949952
GenBank gi: 16291659

CLONE INFO
Clone Id: HVSMEl0017D16f
Source: CUGI
DNA type: cDNA

PRIMERS
Sequencing: AATTAACCCTCACTAAAGGG
PolyA Tail: Unknown

SEQUENCE



GCGAGCATGAGTCCGTGGCAGTTGCTAATGTTGTTGACTCAGGTGATGATGACGAACCAG 

ATGAGCCCATACAGATTCCTAAAATGGATGCGAAGCTGGCAGAAGATATTGTTCGCAAGT 

GGCAGAGCATCAAATCCAAGGCCTTGGGATCAGATCATTCTGTTGCATCATTGCAAGAGG 

TTCTTGATGGCAACATGCTGAAGGTATGGACGGACCGAGCAGCAGAGATCGAGCGCAAAG 

GCTGGTTCTGGGACTACACGCTGTCCAACGTGGCGATCGACAGCATCACCGTCTCCCTGG 

ACGGACGGCGGGCGACCGTGGAGGCGACAATTGAGGAGGCGGGTCAGCTCACCGACGCAA 

CCGACCCCAGGAACGATGATTTGTACGACACTAAGTACACCACCCGGTACGAGATGGCCT 

TCACCGGACCAGGAGGGTGGAAGATAACCGAAGGCGCAGTCCTCAAGTCGTCATAGGGCG 
Quality: High quality sequence stops at base: 474 



Entry Created: Oct 19 2001
Last Updated: Oct 19 2001

COMMENTS
Total hq bases = 422

LIBRARY
Lib Name: Hordeum vulgare spike EST library HVcDNA0012 (Fusarium
infected)
Organism: Hordeum vulgare
Cultivar: Morex
Tissue type: Spike
Lab host: TJC121
Vector: pBluescript SK(-)
R. Site 1: EcoRI
R. Site 2: XhoI
Description: Plants were grown at the University of Minnesota in the GJ
Muehlbauer lab; spikes were harvested and snap frozen at 0,
1, 2, 3, 4, 5, 6, and 8 days after Fusarium graminearum
inoculation (Heinen). In the TJ Close lab at the University
of California, Riverside, total RNA was prepared from each
sample pool, equal quantities of all eight RNA pools were
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combined, poly (A) RNA was purified from the mixture, one 
primary unamplified cDNA library was made, and 1 million pfu
were in vivo excised to give pBluescript SK(-) cDNA
phagemids (Choi, Fenton, Malatrasi). Phagemids were plated
and picked at the Clemson University Genomics Institute
(CUGI) (Begum, Palmer, Frisch, Atkins and Wing). Plasmid DNA
preparations, DNA sequencing and sequence analysis were
performed at CUGI (Wing, Yu, Frisch, Henry, Simmons, Oates,
Rambo, Main). The sequence has been trimmed to remove vector
sequence and contains a minimum of 100 bases of phred value
20 or above. For more details on library preparation and
sequence analysis see
http://www.genome.clemson.edu/projects/barley. To order this
clone see http://www.genome.clemson.edu/orders Also see
Close TJ, Wing R, Kleinhofs A, Wise R (2001) Genetically and
physically anchored EST resources for barley genomics.
Barley Genetics Newsletter 31:29-30.
(http://wheat.pw.usda.gov/ggpages/bgn/31/cover.html)



SUBMITTER 

Name : Wing RA 

Lab: Clemson University Genomics Institute 

Institution: Clemson University 

Address: 100 Jordan Hall, Clemson, SC 29634, USA 

Tel: 864 656 7288 

Fax: 864 656 4293 

E-mail: rwing@clemson.edu 

CITATIONS 

Title: Development of a genetically and physically anchored EST
resource for barley genomics: Fusarium infected Morex spike
cDNA library

Authors: Wing,R., Muehlbauer,G.J., Close,T.J., Kleinhofs,A., Wise,R.,
Heinen,S., Begum,D., Frisch,D., Yu,Y., Henry,D., Palmer,M.,
Rambo,T., Simmons,J., Fenton,R.D., Malatrasi,M., Choi,D.W.,
Oates,R., Main,D.

Year: 2001 

Status : Unpublished 



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dbEST Id: 
8864363
EST name: AV833644
GenBank Acc: AV833644
GenBank gi: 14525733

CLONE INFO
Clone Id: bagsldll
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown

SEQUENCE



GAAACTCTGGNNGNAGATCACCCCTCGTTGTGTTTTAGAGCTTCTTGCCCTTCCTCTTGA 
CGAGNAAGCACCAGAGTAAACGCCAAGNAAGGTCTTCGTGGTGTGAGAAACATTTTGTGG 
AGTGTTGGTAGAGGAGGTATTGCTACTGTTGGTGGAGGATTTTCACGGGAAGCCTACATG 
AATGAGGCCTTTTTGCAGATGACATCAGCTGAGCAGATGGATTTCTTTTCAAAAACGCCG 



AATAGCATACCACCTGAATGGTTTGAAATCTATAGCGTGGCACTCGCAAATGTTGCTCAA 
GCAATTGTAAGTAAAAGGCCAGAGCTCATCATGGTGGCAGATGATCTTTTCGAACAGCTC 
CAGAAGTTCAATATCGGTTCTCAATATGCTTATGGTAACGAGATGGATCTTGCGTTGGAA 
AGGGCACTTTGCTCATTGCTTGTGGGAGACATTAGCAACTGCAGAACTTGGCTTGCGATT 



GATAATGAATCTTCACCACATAGAGACCCGAAAATTGTAGAGTTTATTGTGAACAACTCT 

AGCATTGACCACCAGGAGAATGATCTTCTTCCAGGCCTGTGTAAGCTTTTGGAGACTTGG 
CTTGTCTCAGAGGTTTTCCCTA 



Entry Created: Jun 22 2001
Last Updated: Jun 22 2001

COMMENTS
Sato,K., Saisho,D., Takeda,K., Shini,T. and Kohara,Y. Direct
submission;
database: http://www.shigen.nig.ac.jp/barley/Barley.html

LIBRARY
Lib Name: K. Sato unpublished cDNA library: Hordeum vulgare subsp.
vulgare shoots germination
Organism: Hordeum vulgare subsp. vulgare
Cultivar: Haruna Nijo
Tissue type: shoots
Develop. stage: germination

SUBMITTER
Name: Kazuhiro Sato
Lab: Research Institute for Bioresources
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Institution: Okayama University, Barley Germplasm Center 
Address: Chuo 2-20-1, Kurashiki, Okayama 710-0046, Japan 

E-mail: kazsato@rib.okayama-u.ac.jp,
URL: http://www.rib.okayama-u.ac.jp/barley/



CITATIONS 
Title: Barley EST sequencing project in NIG and Okayama Univ.
Authors: Sato,K.
Year: 2001
Status: Unpublished
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dbEST Id: 
10841891
EST name: AV921157
GenBank Acc: AV921157
GenBank gi: 18216936

CLONE INFO
Clone Id: bagsldll
DNA type: cDNA (3')

PRIMERS
PolyA Tail: Unknown
SEQUENCE 

TGGCTTCACCTGNAAATCCAGCACTAAGTTTCTCTTATCACCAACCCAAGGATCTCTTCT 
AGCCTAGCAATAATCCGAATAGAACACACCGAAAAACAAAGCTCATCGCTGACTAACTGA 
CTAACCAAACTATCTCCGTCTTCCAAACTGACAAGAGCCTAGACTAGACTGCTTATTTAC 
ACACCAGAAAAACACGGGAGGAATCAATCAACAAGGTTTACTGCACGCTGAACGCCCTAT 
GACGACTTGAGGACTGCGCCTTCGGTTATCTTCCACCCTCCTGGTCCGGTGAAGGCCATC 
TCGTACCGGGTGGTGTACTTAGTGTCGTACAAATCATCGTTCCTGGGGTCGGTTGCGTCG 
GTGAGCTGACCCGCCTCCTCAATTGTCGCCTCCACGGTCGCCCGCCGTCCGTCCAGGGAG 
ACGGTGATGCTGTCGATCGCCACGTTGAACAGCGTGTAGTCCCAGAACCAGCCTTTGCGC 
TCAATCTCTGCTGCTCGGTCTGTCCATACCTTCAGNATGTTGCCATCAAGAACCTCTTGC 
AATGATGCAACAGAATGATCTGATCCCAAGGCCTTGGATTTGATGCTCTGCCACTTGCGA 



ACAA 



Entry Created: Jan 18 2002
Last Updated: Jan 18 2002

LIBRARY
Lib Name: K. Sato unpublished cDNA library, cv. Haruna Nijo
germination shoots
Organism: Hordeum vulgare subsp. vulgare
Cultivar: Haruna Nijo
Tissue type: shoots
Develop. stage: germination

SUBMITTER
Name: Tadasu Shin-i
Lab: Center For Genetic Resource Information
Institution: National Institute of Genetics
Address: 1111 Yata, Mishima, Shizuoka 411-8540, Japan
Tel: 81-559-81-6856
Fax: 81-559-81-6855
E-mail: tshini@genes.nig.ac.jp
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CITATIONS 

Title: Barley EST sequencing project in NIG and Okayama Univ 

Authors: Sato,K., Saisho,D., Takeda,K. 

Year: 2002 

Status: Unpublished 



FIG. 8 continued 40/110 



dbEST Id: 6212986
EST name: OV1_8_A03.g1_A002
GenBank Acc: BE917942
GenBank gi: 10420549

CLONE INFO
DNA type: cDNA

PRIMERS
Sequencing: PolyTMix
PolyA Tail: no

SEQUENCE



TATGGGTCTGTGGCAGTTGCTGACTCTGTTGATGGTCTGGGAGCAGATGAAGAGCCACTA 

GAAATTCCTAGAATGGATGCAAAGTTGGCTGAAGATATTGTTCGCAAGTGGCAAAGTATC 

AAGTCCAAGGCTTTGGGGCCAGAACACACTGTCACGGCATTGCAAGAGATCCTCGATGGC 

AACATGCTGAAGGTATGGATGGACCGAGCCACAGAGATTGAGCGTCACGGTTGGTTCTGG 

GAATACACACTCTCCGACGTGACGATCGACAGTATCACCGTCTCCATGGACGGTCGACGG 

GCAACTGTGGAGGCGACGATTGAGGAGATGGGCCAACTTACCGACGTAGCAGACCCAAAG 

AACAACGACGCCTACGACACAAAGTACACCGCTCGGTACGAGATGAGCTACTCCAAGTCC 

GGAGGGTGGAGGATCACCGAAGGAGCAGTCCTCAAGTCGTAGAACGGTCGTGCAGCAGGA 

GTAGGCGAGTAGGGGTTGCTCAACTCCCATTCTTTTTTCTTTTGCACCAGTGTATGTAAA 

TAAACAGTGTGAGCACAGGTTCTTTTCTCTCCTGGAGAGAGTTTGGTTAGGTTGATTAGT 

GATGAGTTCCTGAGGCCGAGAGAATTTGTCATCTAGTTTGTATTGATAGAGAT 
Quality: High quality sequence starts at base: 17 

Quality: High quality sequence stops at base: 640 
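
The two Quality lines above delimit a high-quality window inside the EST. As a minimal sketch, assuming the positions are 1-based and inclusive (the record does not say so explicitly), the window can be cut out as shown below. The function name high_quality_window is hypothetical and the demo sequence is made up. Under the same assumption, a window from base 17 to base 640 spans 640 - 17 + 1 = 624 bases.

```python
from typing import Optional


def high_quality_window(seq: str, start: int = 1, stop: Optional[int] = None) -> str:
    """Return seq[start..stop], treating both positions as 1-based and inclusive.

    Assumption (not stated in the record): the "starts at base" and
    "stops at base" values use 1-based, inclusive coordinates.
    """
    if stop is None:
        stop = len(seq)
    if not (1 <= start <= stop <= len(seq)):
        raise ValueError("window must satisfy 1 <= start <= stop <= len(seq)")
    # Python slices are 0-based and end-exclusive, hence start - 1 and stop.
    return seq[start - 1:stop]


# Illustrative use with a short made-up sequence.
demo = "ACGTACGTAC"
print(high_quality_window(demo, start=3, stop=6))  # GTAC
```

For a record that only reports a stop position, leaving start at its default of 1 keeps everything up to and including that base.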



Entry Created: Sep 29 2000
Last Updated: Sep 29 2000

COMMENTS
Sequences have been trimmed to exclude PolyA, vector and
regions below Phred quality 16. The threshold for highest
quality sequence is 20.

LIBRARY 

Lib Name: Ovary 1 (OV1) 

Organism: Sorghum bicolor 

Organ: Mix of ovaries of varying immature stages from 8-week-old
plants

Vector: pBluescript II from Lambda Zap II

R. Site 1: XhoI
R. Site 2: EcoRI

Description: The library was made from poly-A RNA in the cloning vector
lambda ZAP II. Clones to be sequenced were prepared by mass
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excision. 

SUBMITTER 

Name: Cordonnier-Pratt MM
Lab: Laboratory for Genomics and Bioinformatics
Institution: The University of Georgia, Department of Plant Biology
Address: Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271,
USA
Tel: 706 542 1860
Fax: 706 583 0210
E-mail: mmpratt@uga.edu



CITATIONS 
Title: An EST database from Sorghum: ovaries of varying immature
stages
Authors: Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M.,
Pratt,L.H.
Year: 2000
Status: Unpublished



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dbEST Id: 
6213567
EST name: OV1_8_A03.b1_A002
GenBank Acc: BE918523
GenBank gi: 10421712

CLONE INFO
DNA type: cDNA

PRIMERS
Sequencing: JEN REV
PolyA Tail: no

SEQUENCE

GCACGAGGATAGAACAGCTAGACAGATCAGGCAAGGATACCCCAGGTGATGATCTTGAGA 

AATCTCTTGAAAAACTTGCCCAAGAAATGTTGCTGGAGATGCTATCCATGATTCCAAAAA 

TGCCGCTTTGAAGATTATCTCTGCTGGTGCACTGTTTGCACTATTTGCAGTAATAGGTCT 

GAAGTGCTTGCCTCGTAAGAAGTCACTTCCTGCTCTTAAGAGCGAATATGGGTCTGTGGC 

AGTTGCTGACTCTGTTGATGGTCTGGGAGCAGATGAAGAGCCACTAGAAATTCCTAGAAT 

GGATGCAAAGTTGGCTGAAGATATTGTTCGCAAGTGGCAAAGTATCAAGTCCAAGGCTTT 

GGGGCCAGAACACACTGTCACGGCATTGCAAGAGATCCTCGATGGCAACATGCTGAAGGT 

ATGGATGGACCGAGCCACAGAGATTGAGCGTCACGGTTGGTTCTGGGAATACACACTCTC 

CGACGTGACGATCGACAGTATCACCGTCTCCATGGACGGTCGACGGGCAACTGTG 
Quality: High quality sequence stops at base: 447 

Entry Created: Sep 29 2000 
Last Updated: Sep 29 2000
COMMENTS 



Sequences have been trimmed to exclude PolyA, vector and 
regions below Phred quality 16. The threshold for highest 
quality sequence is 20. 
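
The comment above names two Phred cutoffs: 16 for trimming and 20 for the highest-quality stretch. The sketch below illustrates the idea on made-up per-base scores by finding the longest contiguous run at or above a threshold. This is a deliberately simplified stand-in, not the procedure the submitters used; the function name and the example scores are hypothetical.

```python
from typing import List, Tuple


def longest_run_at_or_above(quals: List[int], threshold: int) -> Tuple[int, int]:
    """Return (start, stop), 1-based inclusive, of the longest contiguous run
    of scores >= threshold. Returns (0, 0) when no base passes.

    Simplified illustration only: real trimmers typically use windowed or
    cumulative-score rules rather than a strict per-base cutoff.
    """
    best = (0, 0)
    best_len = 0
    run_start = None
    for i, q in enumerate(quals):
        if q >= threshold:
            if run_start is None:
                run_start = i
            run_len = i - run_start + 1
            if run_len > best_len:
                best_len = run_len
                best = (run_start + 1, i + 1)
        else:
            run_start = None
    return best


# Made-up Phred scores for a 10-base read.
quals = [8, 12, 18, 25, 30, 22, 17, 9, 21, 24]
print(longest_run_at_or_above(quals, 16))  # (3, 7)
print(longest_run_at_or_above(quals, 20))  # (4, 6)
```

With the looser cutoff the retained stretch is longer; the stricter cutoff picks out a shorter core inside it, which mirrors the relation between the trimmed sequence and the "High quality sequence" positions reported in these records.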



LIBRARY 
Lib Name: Ovary 1 (OV1)
Organism: Sorghum bicolor
Organ: Mix of ovaries of varying immature stages from 8-week-old
plants
Vector: pBluescript II from Lambda Zap II
R. Site 1: XhoI
R. Site 2: EcoRI
Description: The library was made from poly-A RNA in the cloning vector
lambda ZAP II. Clones to be sequenced were prepared by mass
excision.

SUBMITTER
Name: Cordonnier-Pratt MM
Lab: Laboratory for Genomics and Bioinformatics
Institution: The University of Georgia, Department of Plant Biology
Address: Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271,
USA
Tel: 706 542 1860
Fax: 706 583 0210
E-mail: mmpratt@uga.edu






FIG. 8 continued 43/110 



CITATIONS 

Title: An EST database from Sorghum: ovaries of varying immature 

stages 

Authors: Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M.,
Pratt,L.H.
Year: 2000
Status: Unpublished



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dbEST Id: 
11076385
EST name: 952021B01.x1
GenBank Acc: BM498278
GenBank gi: 18649459

CLONE INFO
Plate: 952021 Row: B Column: 01
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown

SEQUENCE



GCCACAGGCCGCCACCGCCTGGCCCCTCCACCTGCCGCTCCGCCAGCCGCTGGGCCGACC 
GCCTCTTCGCCGACTTCCACCTCCTCCCCGCCGCCGCCGACCCGCCAGCCGCGGCCTCCT 
CTTCCTCCTCGTCCCCGTTCGTCCCGATCTTCCCCGAAGCCGCCGACCGCGCCTTGCCCC 
TCCCGGTCGACTTCTACAAGATTCTTGGTGCGGAGCCACATTTCCTAGGCGATGGCATTC 
GGAGGGCGTTCGAGTCGCGGATAGCTAAGCCACCTCAGTATGGGTACAGCACAGAAGCTC 
TTGCTGGGCGACGGCAAATGCTGCAGATTGCCCATGATACTCTCACAAACCAGAGCTCGC 



GCACCGAGTACGACCGTGCGCTTTCCGAGGACCGTGATGCGGCACTCACCATGGATGTTG 



CCTGGGATAAGGTTCCAGGTGTGCTGCGTGTGCTTCAGGAGGCTGGGGAGGCACAACTG

Entry Created: Feb 11 2002
Last Updated: Feb 11 2002

LIBRARY
Lib Name: 952 - BMS tissue from Walbot Lab (reduced rRNA)
Organism: Zea mays
Cultivar: BMS (Black Mexican Sweet)
Tissue type: suspension culture
Develop. stage: mixed logarithmic and stationary growth phases
Lab host: DH10B
Vector: pUC19
R. Site 1: EcoRI
R. Site 2: EcoRI
Description: The library was prepared by George Rudenko using poly(A)
selected RNA and Universal Riboclone cDNA Synthesis System
(Promega). cDNA was synthesized using both random and
oligo(dT) primers in separate reactions and equipped with
EcoRI adaptors. Library was size-fractionated on agarose
gels (for insert size >400bp) and non-directionally cloned
into EcoRI-digested pUC19 vector. Blue/white selection on
carbenicillin-containing plates was used to recover positive
clones.

SUBMITTER
Name: Walbot V
Lab: Department of Biological Sciences
Institution: Stanford University
Address: 855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
E-mail: walbot@stanford.edu
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CITATIONS 

Title: Maize ESTs from various cDNA libraries sequenced at Stanford
University.

Authors: Walbot,V. 

Year: 1999 

Status: Unpublished 
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dbEST Id: 
11076864
EST name: 952021B01.y1
GenBank Acc: BM498757
GenBank gi: 18649938

CLONE INFO
Plate: 952021 Row: B Column: 01
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown

SEQUENCE



AGCAATGTGGGCAAGTGCGACACTATAGATCTCAAACCATTCAGGTGGTATGCTATTCGG 

TGTTTTAGAGAAGAAATCCATCTGCTCAGCTGATGTCATCTGCAAGAAAGCCTCATTCAT 

GAAGGCCTCACGAG7UVAATCCTCCTCCAACAGTAGCAATACCACCCCTGCCAACACTCCA 

CAATATGTTTTTTGCACCTTGCAGACCTTCTTGGCGTTTATTTTTATGTTTTTCATCAGT 

AGGAAGAGCAAGAAGCTCCAATACACAACGAGGTGTAATCTCCTCCAAAGTTTCATCAAT 

CTGTGCAAGCAGTTCAGGTGCAAGATTGCTTGCACCATCCTCCTGCAGGAGCTTCAGTGC 

CCTCTCAAGCACCTCACAACAGCAGATTACATCTGGAGGGCTTGCTGCCATAGCATCCCT 
TGATATGTCCACATAAGCCAATGCCA



Entry Created: Feb 11 2002
Last Updated: Feb 11 2002

LIBRARY
Lib Name: 952 - BMS tissue from Walbot Lab (reduced rRNA)
Organism: Zea mays
Cultivar: BMS (Black Mexican Sweet)
Tissue type: suspension culture
Develop. stage: mixed logarithmic and stationary growth phases
Lab host: DH10B
Vector: pUC19
R. Site 1: EcoRI
R. Site 2: EcoRI
Description: The library was prepared by George Rudenko using poly(A)
selected RNA and Universal Riboclone cDNA Synthesis System
(Promega). cDNA was synthesized using both random and
oligo(dT) primers in separate reactions and equipped with
EcoRI adaptors. Library was size-fractionated on agarose
gels (for insert size >400bp) and non-directionally cloned
into EcoRI-digested pUC19 vector. Blue/white selection on
carbenicillin-containing plates was used to recover positive
clones.

SUBMITTER
Name: Walbot V
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Lab: 


Department of Biological Sciences
Institution: Stanford University
Address: 855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
E-mail: walbot@stanford.edu

CITATIONS
Title: Maize ESTs from various cDNA libraries sequenced at Stanford
University.
Authors: Walbot,V.
Year: 1999
Status: Unpublished
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dbEST Id: 
3713166
EST name: 707034D03.x3
GenBank Acc: AW331058
GenBank gi: 6827415

CLONE INFO
Plate: 707034 Row: D Column: 03
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown

SEQUENCE



CGCGTCGACGTATAGAGTCTGCATCCATGTTGCCTTGAATGAAGCGTCTGCAAAAGAAGG 

CTCTTTTATCACCAGTCGTGTCAGGAAGCATTTTGAAAATATATCAAAATTTCTTTGGCT 

GAGTGATAGGCCTAATTCAAATAGCAAAGGAAGTGATAAACACCCAGCGGTTAATGATAT

TACTGCTGCAGTTTGCAAGCAAAAGATGGATATTCAAGAAGCAGAAACACTTGTAAAACA 

GTGGCAAGACATAAAATCTGAAGCTCTTGGCCCTGACTATCAAACTGACATGCTACCTGA 

GATTCTTGATGGTTCAATGCTCTCTAAGTGGGAAGACTTAGCGTTATTAGCAAAGGACCA 

GTCTTGCTATTGGAGATTTGTGCTGCTAAATCTTAATGTTGTTCGAGCCGAGATAATCTT 

GGATGAAATAGGTGCTGGTGAGGCAGCAGAAATTGATGCTGTACTTGAGGAAGCGGCTGA 

GCTTGTTGACGATTCCCAGCCCAAGAAACCGAGTTATTACAGCACATATGAAGTTCAGTA 

CGTATTGAGGAGGCAGAATCATGGATCTTGGAAAATCTCCGAGGCTGCTGTCCGGGACCT 

GACGTGATTTCTGCCAACTCGGCAAACGGGCTACACAACCATTGGCGTATAGGCGGC 



Entry Created: Jan 31 2000
Last Updated: Jan 31 2000

LIBRARY
Lib Name: 707 - Mixed adult tissues from Walbot lab (SK)
Organism: Zea mays
Cultivar: W23
Organ: tassel, kernel, silk, husk, root, leaf
Tissue type: tassel, kernel, silk, husk, root, leaf
Develop. stage: adult
Lab host: DH10B
Vector: pGAD10
R. Site 1: EcoRI
Description: cDNA library from fully differentiated maize tissues from an
active Mutator plant. Tissue ratio is 4/2/1/1/1/1 (tassel,
kernel, silk, husk, root, leaf). Unidirectionally cloned.

SUBMITTER
Name: Walbot V






FIG. 8 continued 49/110 



Lab: Department of Biological Sciences
Institution: Stanford University
Address: 855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
E-mail: walbot@stanford.edu

CITATIONS
Title: Maize ESTs from various cDNA libraries sequenced at Stanford
University.
Authors: Walbot,V.
Year: 1999
Status: Unpublished
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dbEST Id: 
5882137
EST name: Cri2_3_H15_SP6
GenBank Acc: BE641509
GenBank gi: 9959174

CLONE INFO
Clone Id: Cri2_3_H15
Plate: Cri2_3 Row: H Column: 15
DNA type: cDNA (5')

PRIMERS
Sequencing: SP6
PolyA Tail: Unknown

SEQUENCE



GTGGTGTCTTTGCTCGTGTTCCTGGATACACAAGGGATGAGTATATGAAGGCAGCTTTTT 
CTCGAATGACAGCTGCTGAGCAAGTAGCTTTGTTCACAAATACACCCAGTAATATCCCAG 



CAGAGAGTTCTGAGGTTTACACAGTTGCGCTTGCTCACATAGCAGAGGGATTTGTTGCAA 
AGAAGCCGCAATTGATTCAGGAAGCTGATTCACTCTTTCTTCAGCTTCAGCGAACAAATG 



CCTCATCATCTAGTTTGCTAGTTACTGGTGGTCTACGGCCATTATCAAGTCTGCAGCTTG 
ATTTTGCTTTTGAACGAGCCATGTGCAAACTGCTCCTAGGAGAACTGGATGGTTGTCGTG 
CATGGCTAGGTTTGGATGATACAAACTCTCCATATAGAGACCCTGCAGTGACTGATTTTG 
TTATAGCTAATTCTTTTGGAAGTGAGGAAGGTGATTATTTACCAGGCCTTTGCAAGTTGT 
TGGAAAGTTGGTTGAGGGAAGCGGTGTTTTTCCCCAACCCGTCAACAGAAAAGTGGAGGT 
ACAAGTTGAGGGAGTATTTTTTATGATGCAAGGAGAAAAAAAGCCGCCGTGAATTTTTTC 



GCGGGGGGCGCTATGAAAAAATATATTCAACCTTTTTTTGTTGGGGCGTCGTCTACAAAG 
AATGATGGAGTGTCATTGTTGCTTTTGAGGTGACGAAGGGGCGGCGCTCCTCTTTAAGGG 



ATCGTCCGTGGGGGCGCGCGCTCCCATATCGCCATCTTCGGGACACCTTGTTCGTGGGTC 



AAATGGTGATGTCTTTTTTACCACGAACGTCACATTATTCTTATAATATAAGCGTGCGGC 



AGCACTCTCAGCTTCGACGAAACAGCCTAAA

Entry Created: Sep 1 2000
Last Updated: Sep 1 2000

LIBRARY
Lib Name: Ceratopteris Spore Library
Organism: Ceratopteris richardii
Cultivar: Brogn
Tissue type: Gametophyte
Cell type: Spore
Develop. stage: 20 hours after germination initiation
Vector: pCMVSPORT6
Description: EST sequence from cDNA library. cDNA library constructed
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from mRNA isolated from C. richardii spores that had 




developed for 20 hours after their germination had been
initiated by white light.

SUBMITTER
Name: Roux SJ
Lab: Section of Molecular Cell and Developmental Biology
Institution: University of Texas
Address: Biology Building, Room 16, Austin, TX 78712, USA
Tel: 512 471 4238
Fax: 512 232 3402
E-mail: sroux@uts.cc.utexas.edu

CITATIONS
Title: Expressed sequence tags of cDNA clones from a C. richardii
library
Authors: Chatterjee,A., San Miguel,P., Stout,S.C., Banks,J., Roux,S.J.
Year: 2000
Status: Unpublished
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dbEST Id: 
9279697
EST name: gc56a02.y1
GenBank Acc: BI437111
GenBank gi: 15261801

CLONE INFO
Clone Id: PEP_SOURCE_ID:PPN190104 (5')
Source: University of Leeds (UK) & Washington University in St.
Louis (USA)
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown

SEQUENCE

GAGAACGGAAGCTTTAGAAGTGGAGGTTGTCCCCAAAATGGATGCTAGGTTGGCGGAAAT 

TATGGTTCGAAGATGGCAAGCAGCTAAAGCTCGAGCACTTGGTTCTGCTCATGATATGGC 

GGCTCTTCCTGAGGTGCTGGAGGGCGAGATGCTGAAGAGCTGGACAGACCGTGTTAGTGA 

CGTCAAGAGAAATGGTTGGTTTTGGGAATACACTCTCCTTGGTCTTCACATTGATAGTGT 

AACAGTAAGTGACGATGGGAGGCGAGCAACTGCGGAAGCCACTTTGCAAGAGGCAGCCCG 

CTTGGTGGACCGCAACAACCCTGACCACAATGATTCTTATAGAAGCACTTACACTACGCG 

ATATGACCTCCGGCATGGCATAGATGGTTGGCGAATCAATGGAGGAGCTGTGCTGCGTAC 

TTGATTCTGAGATTTTCATCTCCGGATCATGTTGACTTGTAGGCAGATCGACTAGTTGCA 

ACCCTTGCATGCTACGAATGAGTAGTCTTTTTGGATATTTTGATCCATCATGCAGCTTTG 
A 

Quality: High quality sequence stops at base: 424 



Entry Created: Aug 21 2001
Last Updated: Aug 21 2001

COMMENTS
Libraries were constructed by Dr. Stavros Bashiardes as part
of the Physcomitrella EST program (PEP) at the Univ. of
Leeds (UK) and Washington Univ. in St. Louis (USA) DNA
sequencing by: Washington University Genome Sequencing
Center For information on obtaining a clone please contact
Celia Knight (c.d.knight@leeds.ac.uk)

LIBRARY
Lib Name: Moss EST library PPN
Organism: Physcomitrella patens
Tissue type: protonemata: 7 day old tissue auxin treated
Lab host: DH10B
Vector: pBluescript SK-
R. Site 1: EcoRI
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R. Site 2: 
XhoI
Description: Construction of the cDNA library was carried out using
Stratagene's 'UniZAP - cDNA synthesis kit'. cDNA was
constructed using an oligo dT primer/linker that contains a
XhoI site within it. Following ds cDNA synthesis, EcoRI
adapters were ligated to the blunt ends and sample was
digested with XhoI. The result is cDNA with an EcoRI sticky
end on one side and a XhoI sticky end on the other. This
cDNA was ligated directionally in UniZAP arms. The vector is
designed containing the pBluescript sequence as well as
lambda DNA and cDNA is cloned within this pBluescript
sequence. The vector was then packaged using Gold
gigapackaging extracts. Library was grown in XL1Blue MRF'
cells and amplified. The library was excised by mass
excision using Stratagene's 'Mass excision kit' that uses
exassist as a helper phage that releases the pBluescript
sequence and circularises it as single stranded plasmids
that are then packaged (by helper phage) and secreted out of
the host cell as phagemids. SOLR cells were transformed with
phagemids and the library was plated out on LB-amp plates to
select for transformants. Approximately 1,000,000 colonies
were grown and recovered. The double stranded plasmid
library was recovered by using Qiagen Midi prep kit. 2
micrograms of each library were used to transform DH10B
cells by electroporation.

SUBMITTER
Name: Ralph Quatrano
Lab: Leeds/Wash U Moss EST Project
Institution: Washington University School of Medicine
Address: 4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108,
USA
Tel: 314 286 1800
Fax: 314 286 1810
E-mail: est@watson.wustl.edu

CITATIONS
Title: Leeds/Wash U Moss EST Project
Authors: Quatrano,R., Bashiardes,S., Cove,D., Cuming,A., Knight,C.,
Clifton,S., Marra,M., Hillier,L., Pape,D., Martin,J., Wylie,T.,
Underwood,K., Theising,B., Allen,M., Bowers,Y., Person,B.,
Swaller,T., Steptoe,M., Gibbons,M., Harvey,N., Ritter,E.,
Jackson,Y., McCann,R., Waterston,R., Wilson,R.
Year: 1999
Status: Unpublished
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Prochlorococcus marinus sp . MED4 analysis files DRAFT 

Produced for the Joint Genome Institute Microbial Sequencing program. 

N.B. : These pages subject to frequent change - work in progress. 

http://genome.ornl.gov/cgi-bin/JGI_microbial/gene_viewer.cgi?org=pmar_med&chr=1&contig=pmar_med&gene=533

Version 1 - pmar_med Gene 533
Gene Finders 
Strand = r 

Stop Location = 1236816 
Stop Codon = tag 

Gene Modeler   Start Location   Start Codon
Generation     1238441          atg
Glimmer        1238837          ttg
Critica        1238924          ttg

MRNA 

ttggaacttccattagatcactttcgtttaataggcgtaagcccctcagcaacatctgaggaaatattaagggct 
ttcca 

attacgcttggataaaactcctgatgaaggattcacgtacgaggttttaactcaaaggtcggaattgcttcgcct 
tactg 

cagatttgcttacagatccagatagtagaagagattacgaaaatttattactaaatggagcatcaggtttagatt 
tatct 

tccaatagagaggttgcaggattaattctcctttgggaatcgggctcttctaaagaagcctttaaaataacaaga 
aaagc 

attgcaacccccccaaactcctgcattgggtagcagtagagaagctgatcttaccttgttagcggctttaacatc 
tagag 

atgctgcaatacaagagcaagatcaaagatcttactcaaatgctgcagattttttacaagaaggcatacagcttc 
ttcaa 

agaatgggcaaactaggggaattacggaaaactcttgaggaggacttagtgtcgcttcttccgtatcgaattctt 
gattt 

gttaagtagagatctaaatgattatgactcgcataaaaaaggtttaagtatgctggaaaatttaataatcaaaag 
aggtg 

gattagaaggaaaaaataaatctgaatataatgattttctaaatcagcaagaatttgaatctttctttcaacaaa 
taaag 

ccattcttgactgttcaggatcagatagatttatttttagaattacaaaaaaggggttcaagtgaagcaggattt 
ttagc 

ttttttatctttaacagcaattggttttgcaagaagaaaacctgcaaaattattcgaagctcgaaaaatattaaa 
aaaac 

taaatttatcaggacttgactcaatgccattaataggttgccttgatttgcttttagcagatgttgagcaatcct 
cagca 

aggtttttaagtagttccgatgagaagttaagagattggttgaataattatcctggagaaaaattagaagcaata 
tgtat 

tttttgtaaaaattggttagaaaatgatgttttggttggttatagggatattgatttaaaagaaatcgatttaga 
ctctt 

ggtttgaagatagagaaatccaagaatttattgagcaaatagaaaagaagtcaaatagaactgtgtttaagtctg 
ggcct 

caaaataaacctatttttcaagcccaagaatctttaaaagattcaagtacgggccctgatttaaattcggataat
tttga 

agaaggccgattacctttgcctggaggagtaagagaagatggtcaagaagttattgaagaaaatatttatacaga 
tgaga 

ttattaaaaacaaatcaatagaattttataagtacgcaatagaaaaaattgctgaattaaaatttgtatttggag
aagcc 

ttagagaactacagaatatttaataaatcttcctacctaacatatctgtatgcttttttgattttatttgctttt
ggcct 

aggtgttggatttgtaagaaataatctcaaaaaacccgtgcaggaaaaagaaataattgataactcgttatcgat
aaatg 

aaaataagaatgtcttttatgaaggtttaaatcaagatgataaaaagaaagttctcgataactcaaaaattattc 
tctca 

gataatgcagaaaaagttattttttcaggtgaagaaataaaaactgcttctccctccttagaaaaaatagaaaat 
ttaat 

taatacatggcttgttaacaaaagtaaatttctagcaggaaaaggtgaaattaatttatcaaagatagttcaaga 
tgatt 

tgattgatagattaaagaaggaaagagaacttgatattcaaaaaggtatctacaaaaatatcaatgctaatatcg 
aaaat 

attgtacttttaactcaaacggcatcaagaatatcagtatcagttgacttaaagtattcagaaaaaatattaaaa 
ataga 

tggggaattgataaatgaaacaactttcactccttttttgaaagttaaatatattttaggtttctcaaataactc 
ctgga 

aattagttgactacattagtggtgtttag 



PROTEIN 

LELPLDHFRLIGVSPSATSEEILRAFQLRLDKTPDEGFTYEVLTQRSELLRLTADLLTDPDSRRDYENLLLNGAS 
GLDLS 

SNREVAGLILLWESGSSKEAFKITRKALQPPQTPALGSSREADLTLLAALTSRDAAIQEQDQRSYSNAADFLQEG 
IQLLQ 

RMGKLGELRKTLEEDLVSLLPYRILDLLSRDLNDYDSHKKGLSMLENLIIKRGGLEGKNKSEYNDFLNQQEFESF
FQQIK 

PFLTVQDQIDLFLELQKRGSSEAGFLAFLSLTAIGFARRKPAKLFEARKILKKLNLSGLDSMPLIGCLDLLLADV 
EQSSA 

RFLSSSDEKLRDWLNNYPGEKLEAICIFCKNWLENDVLVGYRDIDLKEIDLDSWFEDREIQEFIEQIEKKSNRTV
FKSGP 

QNKPIFQAQESLKDSSTGPDLNSDNFEEGRLPLPGGVREDGQEVIEENIYTDEIIKNKSIEFYKYAIEKIAELKF 
VFGEA 

LENYRIFNKSSYLTYLYAFLILFAFGLGVGFVRNNLKKPVQEKEIIDNSLSINENKNVFYEGLNQDDKKKVLDNS
KIILS 

DNAEKVIFSGEEIKTASPSLEKIENLINTWLVNKSKFLAGKGEINLSKIVQDDLIDRLKKERELDIQKGIYKNIN 
ANIEN 

IVLLTQTASRISVSVDLKYSEKILKIDGELINETTFTPFLKVKYILGFSNNSWKLVDYISGV*

DRAFT Prochlorococcus marinus sp. MIT9313 analysis files

Produced for the Joint Genome Institute Microbial Sequencing program. 

N.B.: These pages subject to frequent change - work in progress. 

http://genome.ornl.gov/cgi-bin/JGI_microbial/gene_viewer.cgi?org=pmar_mit&chr=18oct01&contig=Contig475&gene=2677



Version 18oct01 - Contig475 Gene 2677 
Gene Finders 
Strand = f 

Stop Location = 398272 
Stop Codon = taa 

Gene Modeler   Start Location   Start Codon
Generation     396287           gtg
Glimmer        396287           gtg
Critica        396287           gtg

MRNA 

gtggacctgccaatagatcatttccgcttgctgggtgtcagtccttcggcagacagtgaggcgattttgcgggcc 
ttgga 

gttgaggttggatcgctgccctgaccaaggtttcacccatgaggtcttaattcagcgggcagaattgttgcggct 
ttcag 

cagatttgctgactgatccgccacggcgtcaggcctatgagactgccttgttggagctcagtcgtgatcatccag 
gtgag 

accgccggtcttgatgtgtcacctagtagagaggtggcagggctgatcttgctgtttgaagcgaattcttctcat 
gaggt 

ttttcatctcgcctctcagggattgcaaccgccccagtccccgacgctaggtagcgaacgagaagctgacctcgc 
tttgt 

tgttggcactggcctgtcgggctgcagccgctgaggaacaggaacaacggcgttatgaagcagcagcgtctcttc 
tgcat 

gacgggatccagttgctgcagcggatgggcaagctctccgaagagtgccacaagcttgagaacgatttagatgcc 
cttct 

gccctatcgcattctcgacttattgagtcgggatcttggtgatcaggtttctcaccaggaaggactgcgcctact 
tgaca 

actttgtgagccagagaggaggtcttgagggaacggccccatcgcctgcacctggtggtcttgatcagtccgaat 
ttgac 

aacttcttcaagcagatcagaaagtttttaactgttcaggaacaggttgatcttttcctgcgctggcagcaagcc 
ggatc 

agcagatgcgggtttcctgggtgggttggctcttgctgctgttggattttcgcgtcggaagcctgaacgggtgca 
ggaag 

ctcggcagcacttagagaggcttcaactggatggatgcgacccgttgccgatgctgggttgcttggacctcttgc
tcgga 

gatgtgggccgcgctcaggagcgttttctgcgcagtacagatcctcgagtgaaggactgtcttaacagccaccct 
ggcga 

tgaattggctgctttttgtgagtactgccgctcttggctgcgaggggacgtgcttcccggttatagggatgtgga 
tgctg 

aggccgttgatctagaggcttggtttgctgatcgggatgttcaggcttatgtggagcgcctggaacgcagcgaaa 
atcgt 

gcttcttctttaggtaaggccttctcaggatcgtctgtgaagcaacccttcccttgggcgcctcttgatcccgat 
gggat 

tttgcccctctctcttggtgggcctgatgttggtcaacctgcagctgatcagagctctgatgagtttgccagcga
tggta

tggcatggattgatcgtttagcagatctgccacgcccgacgcggccggtgctgatcggttcggttgtctttgcgg
ccctg 

attgcagcctttgcaggcttcagtttgtttggccaacgtcctcgtacgtcagttagtacggctgctgatcagcct 
caagt 

cacagcacctcctacagccacactgcaagaggaggtcctcatgcctcaagtccctgtcagcgctgtggttgagcc 
gctta 

ctttggagcagccgaatgaggcacagctcaaaggcctgcttcaggcctggctcagcaacaaggcagtcgtgcttg 
ccggt 

ggcaagagtgatgcactgcctgaggtcgcaagagatccattggtgcagcgcgtggcgcaagagcgtgccagggat 
gctgc 

tttagctcagacccagaaggttgtggccagcatcagctctgtagaggtggtgagtcgaacgccgcagcgtattga 
gctga 

atgccgttgtgacctatcgcgatcaacgcgttgatgctgccggcaaggttgttgaccaaacgccccaaaaagatc 
tctcg 

gtgacttacatccttggtcgtgatcccgatcgttggcgcctgcatgaatacatcagcggcaaataa 



PROTEIN 

VDLPIDHFRLLGVSPSADSEAILRALELRLDRCPDQGFTHEVLIQRAELLRLSADLLTDPPRRQAYETALLELSR 
DHPGE 

TAGLDVSPSREVAGLILLFEANSSHEVFHLASQGLQPPQSPTLGSEREADLALLLALACRAAAAEEQEQRRYEAA 
ASLLH 

DGIQLLQRMGKLSEECHKLENDLDALLPYRILDLLSRDLGDQVSHQEGLRLLDNFVSQRGGLEGTAPSPAPGGLD 
QSEFD 

NFFKQIRKFLTVQEQVDLFLRWQQAGSADAGFLGGLALAAVGFSRRKPERVQEARQHLERLQLDGCDPLPMLGCL
DLLLG 

DVGRAQERFLRSTDPRVKDCLNSHPGDELAAFCEYCRSWLRGDVLPGYRDVDAEAVDLEAWFADRDVQAYVERLE 
RSENR 

ASSLGKAFSGSSVKQPFPWAPLDPDGILPLSLGGPDVGQPAADQSSDEFASDGMAWIDRLADLPRPTRPVLIGSV 
VFAAL 

IAAFAGFSLFGQRPRTSVSTAADQPQVTAPPTATLQEEVLMPQVPVSAVVEPLTLEQPNEAQLKGLLQAWLSNKA
VVLAG

GKSDALPEVARDPLVQRVAQERARDAALAQTQKVVASISSVEVVSRTPQRIELNAVVTYRDQRVDAAGKVVDQTP
QKDLS

VTYILGRDPDRWRLHEYISGK*

Synechococcus sp. PCC7002

>gnl|jmarq_32049|Contig051302-306 Synechococcus sp. PCC 7002 unfinished
fragment of genome Length = 107169

DNA:

>Synechococcus sp. PCC7002 Contig051302-306 position 55303..57453 reverse complement

GTGCGCATTCCGCTCGACTATTACCGCATCCTATGCGTCCCCGCCAAGGCAACCACTGCCCAAATTACCCAAGCC 
TATCGCGATCGCCTCTCCCAATTTCCCCGTCGCGAACATAATGCCTTGGCCATTGAGGCCCGCAACCGGATTATC 
GAGCAAGCCTTTGAGGTGTTATCCCAAACAGAAACCCGCGCCGTCTACGACCATGAGCTGTCGGGCAATATGTTT 
CGTTCCCTCGTCCCCAGCCGTCCGAAACTGCCTTTTCCCGATCGCCCCTCCAGTGACACAGAGTTAGAAGCCCTG 
ACAGCCCACCAACCAACCATTGACATCGCGGAAAAAGATTTACTGGGGGGACTGCTGTTACTCCTCGACCTGGGG 
GAGTACGAATTAGTGCTGAAGTGGGCTGCCCCCTACCTCAAGGGCAAAGGCAAGCTGGTCAAGGAAGGGAAATTT 
GGGGCCGTCGAAATCGTCGAGCAAGAACTACGGCTTTGTTTGGCCCTGGCCCACTGGGAATTGAGCCGGGAACAG 
TGGCTCCAACAACATTATGAACAGGCGGCTCTCTCCGGTCAGAAGAGTCAAGAGCTATTGGTAGATGTGGCACAA 
TTTGCAGACCTCCAACAGGAAATTCAAGGGGATCTCAATCGCCTCAGACCCTATCAAGTTCTAGAACTTCTGGCC 
CTACCCGAATCAGAAACCCAAGAGCGACAACGGGGCTTACAACTGCTCCAGGAAATGTTGAGTGCTCGCGTGGGG 
ATTGATGGCCAGGGGGACGATCAGTCGGGTCTAAGTATTGATGATTTTTTGCGCTTTATCCAGCAGTTACGCAGT 
TATCTAACGGTGCAAGAACAGTTGGATCTCTTTGTGGCAGAATCAAAGCGACCTTCGGCGGCAGCGGCCTACCTA 
GCGGTGTATGCTCTCTTGGCTGCTGGGTTTTCGCAACGGAAACCTGACCTGGTCGTGCAAGCCCAGACCCTATTA 
AAACGCCTCGGCAAACGGCAGGATGTTTTCTTGGAGCAATCAATCTGCGCCTTACTTTTAGGTCAGCCGTCGGAA 
GCCAATCAACTGTTAGAACAAAGTCAGGAACAGGAGGCGATCGCCTACATTCAAGAGCAGTCTGAGGGGGCACCG 
GATCTACTCCCAGGCCTATGTCTCTACGGGGAACAGTGGCTGAAGACAGAGGTTTTTTCCCATTTCCGCGATCTC 
CGGCAACGGCTTGAAGATGGCTCTGTTTCGTTGACGGCTTACTTCGCCGATCCTGAAGTGCAGCAATATCTTGAC 
GATCTCCTCACGGAGGCTGTCCCCACACCCACACCACATCCAGACACAGAAAGTACAGCGGCCCCGTCGGAAAAG 
CCACCGGAAACATTACAGTCAGAAACCGGTGTTTCGCCGCATCCCAGTCGTCCCGCCAAGGTTGATTCCTTTGAG 
GATCTCGTCACTCAAACTCCCGCTACAGTTCCCCCGGCACCGCCTTCTCCTGGTGTAGCACCTGTAACTGCGGCA 
TTAAACCCAGACCCGGAAGCGTCTTCTGCTTCGTCAAAATCAGTTTCGTCAAAAAAGTCTATCGGGCCTTGGGGG 
GCGATCGCCGCTATCGTGGGGAGTGTTTTGCTGGTCGTGGGCCTGGTGCGAATTTTGTCTGGCCTAACTACCCAG 
GAACCCTTACAGGTCACCCTCAACGGTGAGCCACCCCTAACGATCCCCAGCTTAGACACCGCCGAGGCAAATAAT 
AATCCGGAGAATGGAGCGACCGATACAACGACAACGCCTGCGCTCAATGAGGCGATCGCCGCTGAGGTGATTCAA 
ACTTGGTTTGAGAGTAAAGCTAGAGCCTTTGGCCAAGACCGTGATTTGGCGGCTCTAGAAAATATTTTGGCAGAA 
CCGTCCCTGTCCCGCTGGCGCAGTAGTGCCCAGGCCGTCCGCAGCGCTGGTACCTACCGCACCTATGACCACAGT 
TTGACCATTGAAACGGTGAGCTTCAACCCAGACCAACCCAATGTGGCGACCGTTGAGGCCCAGGTGCAGGAAAAG
GCAGATTATTACCGGGCGAATGGGGAACGCGATCCCGGCCAGTCCTATGATTCTGACCTGCGTGTCCGCTACAGC 
TTGGTGCGCCAAGGCGATCGCTGGTTGATTCGTTCTTCCCAAACCCTGTAA 



Protein:

>Scc_7002_Sequence 1 ORF:57453..55303 Frame -2

MRIPLDYYRILCVPAKATTAQITQAYRDRLSQFPRREHNALAIEARNRIIEQAFEVLSQTETRAVYDHELSGNMF 
RSLVPSRPKLPFPDRPSSDTELEALTAHQPTIDIAEKDLLGGLLLLLDLGEYELVLKWAAPYLKGKGKLVKEGKF 
GAVEIVEQELRLCLALAHWELSREQWLQQHYEQAALSGQKSQELLVDVAQFADLQQEIQGDLNRLRPYQVLELLA 
LPESETQERQRGLQLLQEMLSARVGIDGQGDDQSGLSIDDFLRFIQQLRSYLTVQEQLDLFVAESKRPSAAAAYL 
AVYALLAAGFSQRKPDLVVQAQTLLKRLGKRQDVFLEQSICALLLGQPSEANQLLEQSQEQEAIAYIQEQSEGAP
DLLPGLCLYGEQWLKTEVFSHFRDLRQRLEDGSVSLTAYFADPEVQQYLDDLLTEAVPTPTPHPDTESTAAPSEK 
PPETLQSETGVSPHPSRPAKVDSFEDLVTQTPATVPPAPPSPGVAPVTAALNPDPEASSASSKSVSSKKSIGPWG 
AIAAIVGSVLLVVGLVRILSGLTTQEPLQVTLNGEPPLTIPSLDTAEANNNPENGATDTTTTPALNEAIAAEVIQ
TWFESKARAFGQDRDLAALENILAEPSLSRWRSSAQAVRSAGTYRTYDHSLTIETVSFNPDQPNVATVEAQVQEK 
ADYYRANGERDPGQSYDSDLRVRYSLVRQGDRWLIRSSQTL 
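
The correspondence between the DNA and protein entries above can be checked mechanically. The following is a minimal Python sketch, not part of the original record: it reverse-complements a strand and translates an open reading frame using the standard codon assignments. The function names, and the choice to render an initial GTG or TTG as methionine, are illustrative assumptions.

```python
from itertools import product

# Standard codon assignments, enumerated in TCAG order. Bacterial table 11
# uses the same assignments and differs only in its set of start codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Start codons rendered as methionine when start_as_met is True (assumption:
# only these three are handled in this sketch).
ALT_STARTS = {"ATG", "GTG", "TTG"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, start_as_met: bool = True) -> str:
    """Translate an ORF codon by codon; '*' marks a stop codon."""
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if i == 0 and start_as_met and codon in ALT_STARTS:
            protein.append("M")
        else:
            protein.append(CODONS[codon])
    return "".join(protein)


# First 30 nt of the PCC7002 ORF shown above.
orf_start = "GTGCGCATTCCGCTCGACTATTACCGCATC"
print(translate(orf_start))                      # MRIPLDYYRI
print(translate(orf_start, start_as_met=False))  # VRIPLDYYRI
print((57453 - 55303 + 1) // 3)                  # 717 codons, stop included
```

With `start_as_met=False` the first residue is printed literally (V for GTG, L for TTG), which is the convention the JGI gene-viewer protein entries in this figure appear to follow; the PCC7002 entry instead begins with M.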




FIG. 8 continued 59/110 



LOCUS       AF421196     2469 bp    DNA             linear   BCT 18-OCT-2001
DEFINITION  Synechococcus sp. PCC 7942 cell division protein Ftn2 gene,
            complete cds.
ACCESSION   AF421196
VERSION     AF421196.1  GI:16226083
KEYWORDS    .
SOURCE      Synechococcus sp. PCC 7942.
  ORGANISM  Synechococcus sp. PCC 7942
            Bacteria; Cyanobacteria; Chroococcales; Synechococcus.
REFERENCE   1  (bases 1 to 2469)
  AUTHORS   Koksharova,O.A. and Wolk,C.P.
  TITLE     Two novel genes, one bearing a DnaJ motif, are involved in control
            of cyanobacterial cell division
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2469)
  AUTHORS   Koksharova,O.A. and Wolk,C.P.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-SEP-2001) Plant Research Laboratory, Michigan State
            University, DOE Plant Research Laboratory, East Lansing, MI 48824,
            USA.
FEATURES             Location/Qualifiers
     source          1..2469
                     /organism="Synechococcus sp. PCC 7942"
                     /strain="PCC 7942"
                     /db_xref="taxon:1140"
     CDS             319..2214
                     /codon_start=1
                     /transl_table=11
                     /product="cell division protein Ftn2"
                     /protein_id="AAL16071.1"
                     /db_xref="GI:16226084"
                     /translation="MRIPLDYYRILCVGVQASADKLAESYRDRLNQSPSHEFSELALQ
                     ARRQLLEAAIAELSDPEQRDRYDRRFFQGGLEAIEPSLELEDWQRIGALLILLELGEY
                     DRVSQLAEELLPDYDASAEVRDQFARGDIALAIALSQQSLGRECRQQGLYEQAAQHFG
                     RSQSALADHQRFPELSRTLHQEQGQLRPYRILERLAQPLTADSDRQQGLLLLQAMLDD
                     RQGIEGPGDDGSGLTLDNFLMFLQQIRGYLTLAEQQLLFESEARRPSPAASFFACYTL
                     IARGFCDHQPSLIHRASLLLHELKSRMDVHIEQAIASLLLGQPEEAEALLVQSQDEET
                     LSQIRALAQGEALIVGLCRFTETWLATKVFPDFRDLKERTAPLQPYFDDPDVQTYLDA
                     IVELPSDLMPTPLPVEPLEVRSSLLAKELPTPATPGVAPPPRRRRRDRSERPARTAKR
                     LPLPWIGLGVVVVLGGGTGVWAWRSRSNSTPPTPPPVVQTLPEAVPAPSPAPVTVALD
                     RAQAETVLQNWLAAKAAALGPQYDRDRLATVLTGEVLQTWQGFSSQQANTQLTSQFDH
                     KLTVDSVQLSDGDQRAVVQAKVDEVEQVYRGDQLLETRRDLGLVIRYQLVRENNIWKI
                     ASISLVR"

BASE COUNT      493 a    712 c    712 g    552 t

ORIGIN 

1 cttgccgact aaaggctaag catcgccatt ccttagatta aagcagtctg tcggcggcgc 
61 tgtgccggtt aacaccagtc tgtcgctgac agcggtgcct ttctggggct tgcctgtggg 
121 gcgagtaacc gatcgctggg ataagagttg gtgcttctgg ctctcaagaa tagggttttc 
181 cgtcgcgtat tcccgatcac atccccctgt gtctgctacg gagataacgc cgatcactca 
241 acagaattgg taagttgacg gtcaagttgg gatgatgaag tcggctcaag ctggcgatcc 
301 ggatctggtg ggtgttctgt gcgtattcct ctcgattact accgaattct ctgtgttggc 
361 gtgcaagcct cggcagacaa acttgccgaa agctaccgcg atcgcctcaa ccaatcgccc 
421 tcccatgagt tttcagagct ggcattgcag gcgcggcggc aactcctcga agcagcgatt 
481 gctgagctga gtgatcccga acagcgcgat cgctacgatc gccgcttttt tcagggcggt
541 ctggaagcga ttgaaccaag cctagaactc gaagactggc agcgaattgg agccctgctg 
601 atcctgctgg aattggggga atacgatcgc gtttcgcaac tggctgagga actcctgcca 
661 gactacgacg cgagcgcaga agtacgcgat cagttcgcgc cjgggtgatat cgccttggcg 
721 atcgcactat cccagcaatc cctcggtcga gaatgccgtc agcagggtct gtacgaacag 
781 gccgcccagc actttggccg cagccagtct gccctagccg atcatcagcg ctttcctgaa 
841 ctgagtcgaa ccctgcacca agaacaagga cagctacggc cctatcgcat tttggagcgg 
901 ttggcccagc ccttgactgc cgatagcgat cgccagcagg gtttgctgtt gttgcaggcg 
961 atgttggacg accggcaggg cattgaaggc cctggggatg atggctcggg gctgaccctt 
1021 gataactttt tgatgtttct ccagcaaatt cgcggctatc tgaccctggc tgaacagcag 
1081 ttgctgtttg aatcggaagc gcgtcggccc tcgccggctg cgagcttttt tgcctgctac 
1141 accctgattg cgcggggctt ttgcgatcac caaccctcgt tgatccatcg cgccagcttg 
1201 ctcttgcatg aactcaagag ccgcatggat gtgcacatcg aacaggcgat cgccagccta 
1261 ttgctcggac agcccgaaga agctgaggcg ctactcgtcc agagccaaga tgaggaaacc 
1321 ctcagccaaa tccgtgccct agcccaaggg gaagccctga tcgtcggttt gtgccgattc 
1381 acggaaacct ggctagcgac caaggtattt ccggatttcc gcgacctcaa ggaaaggact
1441 gcgccgctgc agccctactt tgacgacccc gatgtccaga cctatctgga tgcgatcgtg 
1501 gagttgccgt ccgatttgat gccaacgccg ctacccgttg agccgcttga ggtgcgatcg 
1561 tcgttgctgg ccaaggaact gccgacccca gcaacgcctg gtgtagctcc accccctcgc 
1621 cgccgtcgcc gcgatcgctc cgaacgtcct gctcgcacgg ccaaacgctt gcccttgccc 
1681 tggattggtt tgggggttgt ggtggttctc ggcggtggaa caggggtttg ggcttggcga 
1741 tcgcgttcca attccacccc gccgaccccg ccccccgtgg ttcaaacgct gcctgaggcg 
1801 gtacctgccc cttcgcccgc gccagttacc gttgccctcg atcgggctca ggctgaaact 
1861 gtgttgcaaa actggttggc cgctaaagct gcagccttgg ggcctcaata cgatcgcgat 
1921 cgcttagcga cggtgctgac cggtgaggtt ctgcagactt ggcagggttt ttctagccag 
1981 caggccaaca cccagctcac atcacagttc gatcacaagt taaccgtcga ctcagttcag 
2041 ctcagtgacg gtgatcaacg agcagtagtc caagccaagg tcgatgaagt tgagcaggtc 
2101 tatcgaggcg accagctgct cgaaacgcgc cgagatttgg gcttggtgat ccgctaccag 
2161 ctcgtgcgcg agaacaacat ctggaaaatt gcttcgatta gtttggtgcg ctaggaattc 
2221 gcaaggggtg aaccccctgc ggtcttttct gtagatcccc tagagcgatc gcagaatgtt 
2281 cagcgattcc tggatgtgcg cttgggcatt caagagtgaa tcaaaaatgt ggcgcacctt
2341 gccctctttg tcgatcacat aagtgacgcg acccggaatc acaaacaggg ttttgggcac 
2401 gccataggtt tgacggaggc gatcgcctgc atcgctcagc agttggaagg gcaagttgta 
2461 tttctgggc 
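
The CDS coordinates of this record (319..2214) can be cross-checked against the length of the encoded protein. The short Python sketch below is an illustration only and not part of the record; `protein_length` is a hypothetical helper that assumes 1-based inclusive coordinates and a terminal stop codon that is not counted as a residue.

```python
def protein_length(first: int, last: int) -> int:
    """Residues encoded by a CDS spanning first..last (1-based, inclusive).

    Assumes the span is a whole number of codons and ends with a stop
    codon, which is not counted.
    """
    span = last - first + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3 - 1


print(protein_length(319, 2214))  # 631
```

The result agrees with the 631-residue translation given in the CDS feature: 1896 nt correspond to 632 codons, the last of which is the stop codon.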

LOCUS       AF421196_1               631 aa            linear   BCT 18-OCT-2001
DEFINITION  cell division protein Ftn2 [Synechococcus sp. PCC 7942].
ACCESSION   AAL16071
VERSION     AAL16071.1  GI:16226084
DBSOURCE    locus AF421196 accession AF421196.1
KEYWORDS    .
SOURCE      Synechococcus sp. PCC 7942.
  ORGANISM  Synechococcus sp. PCC 7942
            Bacteria; Cyanobacteria; Chroococcales; Synechococcus.
REFERENCE   1  (residues 1 to 631)
  AUTHORS   Koksharova,O.A. and Wolk,C.P.
  TITLE     Two novel genes, one bearing a DnaJ motif, are involved in control
            of cyanobacterial cell division
  JOURNAL   Unpublished
REFERENCE   2  (residues 1 to 631)
  AUTHORS   Koksharova,O.A. and Wolk,C.P.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-SEP-2001) Plant Research Laboratory, Michigan State
            University, DOE Plant Research Laboratory, East Lansing, MI 48824,
            USA.
COMMENT     Method: conceptual translation supplied by author.
FEATURES             Location/Qualifiers
     source          1..631
                     /organism="Synechococcus sp. PCC 7942"
                     /strain="PCC 7942"
                     /db_xref="taxon:1140"
     Protein         1..631
                     /product="cell division protein Ftn2"
     CDS             1..631
                     /coded_by="AF421196.1:319..2214"
                     /transl_table=11
ORIGIN
1 mripldyyri lcvgvqasad klaesyrdrl nqspshefse lalqarrqll eaaiaelsdp
61 eqrdrydrrf fqggleaiep sleledwqri gallillelg eydrvsqlae ellpdydasa
121 evrdqfargd ialaialsqq slgrecrqqg lyeqaaqhfg rsqsaladhq rfpelsrtlh
181 qeqgqlrpyr ilerlaqplt adsdrqqgll llqamlddrq giegpgddgs gltldnflmf
241 lqqirgyltl aeqqllfese arrpspaasf facytliarg fcdhqpslih raslllhelk
301 srmdvhieqa iaslllgqpe eaeallvqsq deetlsqira laqgealivg lcrftetwla
361 tkvfpdfrdl kertaplqpy fddpdvqtyl daivelpsdl mptplpvepl evrssllake
421 lptpatpgva ppprrrrrdr serpartakr lplpwiglgv vvvlgggtgv wawrsrsnst
481 pptpppvvqt lpeavpapsp apvtvaldra qaetvlqnwl aakaaalgpq ydrdrlatvl
541 tgevlqtwqg fssqqantql tsqfdhkltv dsvqlsdgdq ravvqakvde veqvyrgdql
601 letrrdlglv iryqlvrenn iwkiasislv r



>gi|17131676|dbj|AP003590.1|AP003590 Nostoc sp. PCC 7120 DNA, complete
genome, section 10/19 Length = 333500

nt 213526..211130

Frame = -2

DNA:

>AP003590 213526..211130 reverse complement

ATTATGTTGATCACGGTGCAGGGGAAGTACGCTGTGCGAATTCCGCTAGATTACTACCGAATTTTAGGGCTACCG 
TTAGCGGCAAGTGATGAACAACTGCGACAAGCATACAGCGATCGGATTGTCCAATTGCCGCGACGGGAGTATTCT 
CAAGCAGCAATTGCTTCCCGTAAACAACTTATAGAAGAAGCTTACGTGGTTTTATCAGATCCAAAGGAACGCAGC 
AGTTATGACCAGCTGTATCTTGCTCACGCCTACGACCCAGACAACGCGGCTACAACCAAAGTGGCAGTGGAAAAT 
CGTGGGGACAGCAACAATGGTCATTTCGATGTCCAAAGCCTGAGCATCGAAGTTTCCTCCGAGGAATTAATTGGT 
GCTTTATTAATTTTGCAAGAGTTGGGAGAGTATGAACTCGTACTCAAGTTAGGTCGTAATTACTTAGGTAATCAA 
AACGGCACAGCATCCACCAGAAATGGCAATCATCGCACGCCTGAAGAATTTCTCGATAGTTCTGAACGTCCAGAT 
ATTCTCTTGACTGTTGCTTTGGCCTCATTAGAATTAGGGCGGGAACAATGGCAACAAGGCCACTATGAAAACGCT 
GCTTTGTCTTTAGAGACTGGGCAAGAAGTGCTGTTTAGTGAAGGCATCTTCCCCAGCGTCCAGGCAGAAATTCAG 
GCTGATCTTTACAAATTACGCCCTTATAGAATTTTAGAATTACTTGCCTTACCCCAGGAAAAAACCATTGAACGC 
CACCAAGGGCTGGATCTATTACAAAGCATCTTAGACGATCGCGGTGGCATTGATGGTACAGGCAATGATCAATCA 
GGCTTAAACATTGATGACTTCCTCCGATTCATCCAGCAATTACGCCACCACTTAACAGTGGCTGAACAACATAAG 
TTGTTTGATGGTGAAAGCAAACGCCCTTCGGCTGTGGCTACATACTTAGCTGTTTATGCTTCCATCGCCAGAGGA 
TTCACCCAACGCCAGCCCGCTTTAATTCGTCATGCCAAGCAAATTCTGATGCGTTTGTCTAAGCGGCAAGATGTG 
CATTTAGAGCAGTCCCTGTGTGCGCTATTACTAGGGCAAACTGAAGAAGCCACGCGAGTTTTAGAACTGAGCCAA 
GAATACGAAGCTTTAGCCTTAATTCGAGAAAAATCTCAAGATTCACCCGATTTACTGCCAGGTTTGTGCTTATAT 
GCCGAACAATGGCTGCAAAATGAAGTTTTCCCCCATTTCCGCGATTTGTCCAGACAGCAAGCTTCCCTGAAAGAT 
TACTTTGCTAATCAACAAGTACAAGCGTATTTAGAAGCCTTGCCCAACGACGCGGAAACCACTAATGAATGGGCT 
GTAATTAACCGCCAATCGTTTTCTCAACCCAGGGGCAATTCTTACTCTGGAGGAACGCCAGTCGCCAAACGTCCC 
GTAGGGAAGGCGAACAGGCCAGGAGAAGCGTCCACAAGACCAGTTCCCCAACGTAGTCATCCATCAGAAGTAAAT 
CGGCAGTTTCATCAAAACAGAACCCCTGATCCCGAATTACCAGAAACATCAAACCACAGAAGACCAGAGTCTTCA 
AATTTTACAACTGCTAGAGAAAATATATCGACCACAGATGCTTACACTGACAATTATCCACCAGAGATCCCTGTA 
GAACGCGCCAGCAGACCTGTTCAGCCGGGGGTAAGTGGTTATACCCAATCGACCCCTCCACGGCAAACTCCTAAA 
CGCAGGAGACGCAAGAAGCCACAGGCAGTTGTCAACAGAGGACACAGTATTCATCAGCAACGCCAACCCTCACCT 
AGCACTCTAGGCCGGAAAACAAGATTACTTTGGATAGTTTTGGGTTCTTTGGGTGGGATATTATTGTTCTGGCTG 
ATAGTCTCAACGACTTTTGGGTGGTTAAAGAATGTATTCTTCCCAGCACCATCTTTACAAGGTGAGCAATTATCG 
ATTCAGATTAGTCAACCACCTTTAGAGATTCCTGACAAAAATGCCCAGATACAATCCCCAGAGGTGAGTCTCACA 
GAAGAAACGGCAAGGAAAATAATTGAAAATTGGTTGGCTACCAAAGCTAGTGCTTTAGGCGCTGAACATAAAATT 
GAGAGTTTAAACGAGATTTTAACTGGTTCAGCGTTATCTCAATGGCGGCTAATTGCCTTGCAAGATAAAGCAGAC 
AATCGTCATCGAGAATACAGTCATAGTGTCAAGGTAGACTCCATCAGTAAATCTGACATAGATCCCAATCGTGCA 
AGTGTGGGGGCTACAGTCAGAGAGTTAACCCAATTTTATGAGAATGGGCAAAAAGGGAAGTCTTCTGACGAAAGA 
TTACGTGTACGCTATGAATTGATTCGACAAGATGATATTTGGCGGATTCAGAGGATGTCAGCCGCTATAAATTAA 



Protein: 



LOCUS       BAB74406                 798 aa            linear   BCT 28-NOV-2001
DEFINITION  ORF_ID:all2707~hypothetical protein [Nostoc sp. PCC 7120].
ACCESSION   BAB74406
VERSION     BAB74406.1  GI:17131800
DBSOURCE    locus AP003590 accession AP003590.1
KEYWORDS    .
SOURCE      Nostoc sp. PCC 7120.
  ORGANISM  Nostoc sp. PCC 7120
            Bacteria; Cyanobacteria; Nostocales; Nostocaceae; Nostoc.
REFERENCE   1
  AUTHORS   Kaneko,T., Nakamura,Y., Wolk,C.P., Kuritz,T., Sasamoto,S.,
            Watanabe,A., Iriguchi,M., Ishikawa,A., Kawashima,K., Kimura,T.,
            Kishida,Y., Kohara,M., Matsumoto,M., Matsuno,A., Muraki,A.,
            Nakazaki,N., Shimpo,S., Sugimoto,M., Takazawa,M., Yamada,M.,
            Yasuda,M. and Tabata,S.
  TITLE     Complete genomic sequence of the filamentous nitrogen-fixing
            cyanobacterium Anabaena sp. strain PCC 7120
  JOURNAL   DNA Res. 8 (5), 205-213 (2001)
  MEDLINE   21595285
   PUBMED   11759840
REFERENCE   2  (residues 1 to 798)
  AUTHORS   Kaneko,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-MAY-2001) Takakazu Kaneko, Kazusa DNA Research
            Institute, The First Laboratory for Plant Gene Research; Yana
            1532-3, Kisarazu, Chiba 292-0812, Japan
            (E-mail:kaneko@kazusa.or.jp,
            URL:http://www.kazusa.or.jp/cyanobase/,
            Tel:81-438-52-3935(ex.2338), Fax:81-438-52-3934)
FEATURES             Location/Qualifiers
     source          1..798
                     /organism="Nostoc sp. PCC 7120"
                     /db_xref="taxon:103690"
                     /note="synonym: Anabaena sp. PCC7120"
     Protein         1..798
                     /name="ORF_ID:all2707
                     hypothetical protein"
     CDS             1..798
                     /gene="all2707"
                     /coded_by="complement(AP003590.1:211130..213526)"
                     /transl_table=11
ORIGIN
1 mlitvqgkya vripldyyri lglplaasde qlrqaysdri vqlprreysq aaiasrkqli
61 eeayvvlsdp kerssydqly lahaydpdna attkvavenr gdsnnghfdv qslsievsse
121 eligallilq elgeyelvlk lgrnylgnqn gtastrngnh rtpeefldss erpdilltva
181 laslelgreq wqqghyenaa lsletgqevl fsegifpsvq aeiqadlykl rpyrilella
241 lpqektierh qgldllqsil ddrggidgtg ndqsglnidd flrfiqqlrh hltvaeqhkl
301 fdgeskrpsa vatylavyas iargftqrqp alirhakqil mrlskrqdvh leqslcalll
361 gqteeatrvl elsqeyeala lireksqdsp dllpglclya eqwlqnevfp hfrdlsrqqa
421 slkdyfanqq vqaylealpn daettnewav inrqsfsqpr gnsysggtpv akrpvgkanr
481 pgeastrpvp qrshpsevnr qfhqnrtpdp elpetsnhrr pessnfttar enisttdayt
541 dnyppeipve rasrpvqpgv sgytqstppr qtpkrrrrkk pqavvnrghs ihqqrqpsps
601 tlgrktrllw ivlgslggil lfwlivsttf gwlknvffpa pslqgeqlsi qisqppleip
661 dknaqiqspe vslteetark iienwlatka salgaehkie slneiltgsa lsqwrlialq
721 dkadnrhrey shsvkvdsis ksdidpnras vgatvreltq fyengqkgks sderlrvrye
781 lirqddiwri qrmsaain
//



LOCUS       NP_486747                798 aa            linear   BCT 28-NOV-2001
DEFINITION  hypothetical protein [Nostoc sp. PCC 7120].
ACCESSION   NP_486747
VERSION     NP_486747.1  GI:17230199
DBSOURCE    REFSEQ: accession NC_003272.1
KEYWORDS    .
SOURCE      Nostoc sp. PCC 7120.
  ORGANISM  Nostoc sp. PCC 7120
            Bacteria; Cyanobacteria; Nostocales; Nostocaceae; Nostoc.
REFERENCE   1
  AUTHORS   Kaneko,T., Nakamura,Y., Wolk,C.P., Kuritz,T., Sasamoto,S.,
            Watanabe,A., Iriguchi,M., Ishikawa,A., Kawashima,K., Kimura,T.,
            Kishida,Y., Kohara,M., Matsumoto,M., Matsuno,A., Muraki,A.,
            Nakazaki,N., Shimpo,S., Sugimoto,M., Takazawa,M., Yamada,M.,
            Yasuda,M. and Tabata,S.
  TITLE     Complete genomic sequence of the filamentous nitrogen-fixing
            cyanobacterium Anabaena sp. strain PCC 7120
  JOURNAL   DNA Res. 8 (5), 205-213 (2001)
  MEDLINE   21595285
   PUBMED   11759840
REFERENCE   2  (residues 1 to 798)
  AUTHORS   Kaneko,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-MAY-2001) Takakazu Kaneko, Kazusa DNA Research
            Institute, The First Laboratory for Plant Gene Research; Yana
            1532-3, Kisarazu, Chiba 292-0812, Japan
            (E-mail:kaneko@kazusa.or.jp,
            URL:http://www.kazusa.or.jp/cyanobase/,
            Tel:81-438-52-3935(ex.2338), Fax:81-438-52-3934)
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from BAB74406.
            Method: conceptual translation.
FEATURES             Location/Qualifiers
     source          1..798
                     /organism="Nostoc sp. PCC 7120"
                     /db_xref="taxon:103690"
     Protein         1..798
                     /name="hypothetical protein"
     CDS             1..798
                     /gene="all2707"
                     /coded_by="complement(NC_003272.1:3300430..3302826)"
                     /transl_table=11
ORIGIN
1 mlitvqgkya vripldyyri lglplaasde qlrqaysdri vqlprreysq aaiasrkqli
61 eeayvvlsdp kerssydqly lahaydpdna attkvavenr gdsnnghfdv qslsievsse
121 eligallilq elgeyelvlk lgrnylgnqn gtastrngnh rtpeefldss erpdilltva
181 laslelgreq wqqghyenaa lsletgqevl fsegifpsvq aeiqadlykl rpyrilella
241 lpqektierh qgldllqsil ddrggidgtg ndqsglnidd flrfiqqlrh hltvaeqhkl
301 fdgeskrpsa vatylavyas iargftqrqp alirhakqil mrlskrqdvh leqslcalll
361 gqteeatrvl elsqeyeala lireksqdsp dllpglclya eqwlqnevfp hfrdlsrqqa
421 slkdyfanqq vqaylealpn daettnewav inrqsfsqpr gnsysggtpv akrpvgkanr
481 pgeastrpvp qrshpsevnr qfhqnrtpdp elpetsnhrr pessnfttar enisttdayt
541 dnyppeipve rasrpvqpgv sgytqstppr qtpkrrrrkk pqavvnrghs ihqqrqpsps
601 tlgrktrllw ivlgslggil lfwlivsttf gwlknvffpa pslqgeqlsi qisqppleip
661 dknaqiqspe vslteetark iienwlatka salgaehkie slneiltgsa lsqwrlialq
721 dkadnrhrey shsvkvdsis ksdidpnras vgatvreltq fyengqkgks sderlrvrye
781 lirqddiwri qrmsaain

DRAFT Nostoc punctiforme analysis files

Produced for the Joint Genome Institute Microbial Sequencing program. 
N.B. : These pages subject to frequent change - work in progress. 
http://genome.ornl.gov/cgi-bin/JGI_microbial/gene_viewer.cgi?org=npun&chr=31may01&contig=Contig493&gene=84

Version 31may01 - Contig493 Gene 84 
Gene Finders 
Strand = r 

Stop Location = 105061 
Stop Codon = TAA 

Gene Modeler   Start Location   Start Codon
Generation     107367           GTG
Glimmer        107367           GTG
Critica        107367           GTG

MRNA 

GTGCGAATTCCGCTAGATTACTACCGAATTTTAGGACTACCGTTAGCGGCAAGTGAAGAACAATTGCGACAGGCA 
TACAG 

CGATCGCATTGTACAATTGCCACGACGTGAGTATTCTCAGGCAGCAATTTCTTCTCGTAAACAACTCATAGAAGA 
AGCTT 

ACGTGGTTTTATCAGATCCAAAACAACGCAGTACCTACGATCAGCTTTATCTTGCCCACGCCTATGACCCTGATA 
ACCTT 

GCTGCTGCCGCAGTAGCACAGGAAAATCGTACAGAAAGCACCAAAAGGGGTAGTGATACCCAGAGTCTTGGTATA 
GAAAT 

TACCCAAGACGAATTAGTTGGCGCTTTATTAATTTTGCAAGAGTTGGGTGAATACGAACTTGTATTGAAACTAGG 
TCGTC 

CGTACCTAGTAAATAAAAATAGTGCTACAAGTTCAAGAAAAAGCAATAACTTAGCAGATGAAGAAATTTATGAAA 
GTGCT 

GAACACCCAGATGTCGTTCTCACTGTTGCTCTTGCCTGTCTAGAATTAGGTCGGGAACAGTGGCAGCAAGGTCAC 
TACGA 

AAATGCCGCCATATCCCTAGAAACTGGTCAAGAGCTGCTAGTACGTGAAGGTTTGTTCTCCAGTATCCAGGCAGA 
AATTC 

AGGCTGATCTTTACAAATTGCGGCCATATCGAATTTTGGAGTTGCTCGCATTACCTCAAGAAAAGACTGCCGAAC 
GAAGC 

CAAGGCTTAGAATTATTGCAAAATCTCTTAGAAGATCGTGGCGGGATTGATGGCACGAACAATGATGAATCGGGT 
TTAAA 

CATAGATGACTTTCTGCGATTTATCCAGCAGTTACGCAACCACTTAACAGTTGCAGAACAGCACAAGTTATTTGA 
AGCTC 

AAAGCAAACGTTCTTCTGCTGTTGCCACTTACTTAGCTGTTTATGCCTTGATAGCGCGAGGATTTGCTCAACGGC 
AACCT 

GCTTTAATTCGTCAAGCAAGACAAATGCTCGTGCGTCTGGGCAAGCGCCAAGATGTACATTTAGAACAGTCGCTA 
TGTGC 

CTTACTTTTGGGGCAAACTGAAGAAGCAACTCGTGTTTTAGAACTTAGTCAGGAGTACGAAGCTTTAGCTTTTAT 
TCGGG 

AAAAATCTCAGGACTCTCCAGATTTGTTACCGGGTCTGTGTTTATATGCAGAACAGTGGCTGCAACACGAAGTCT 
TTCCC 

CATTTTCGAGATTTAGCAAACCAGCAAGCTTTCCTAAAAGATTACTTTGCTAACCAACAGGTGCAAGCTTATTTA 
GAAGC 

ACTGCCAACTGATGCCCAAACAACTAATGAATGGGCTGTAATTAACCCCCAGTATTTTCCCCAGGCCAAGGCAAA 
GAATA 

CTCATTTTCATAACAATTCAACTAAAACTTCAGCGTCATTTAATCACAGCAGAGTACCTAACCCAGATTTGCCAG 
AAACA

CCAACAAAAGAAACCTCTGAATATCCAAACTTCTCACCACCTATGTGGAGTTCATCTGGAAGTATAAAATCAGAG
GTTCC 

TGCTGCTGAAAGGATGAGCAGAGGTACTAATCAGCATTTGAACGGTTCAGCTAAGAGTGCTGCATCTGGTCATAA 
CCAAA 

AGCGTAGGCGGAGAAAACCTACTCCATCTGCTAGCCGAGAGCGTATACCAGATAATCGTCCTCATTCTCGTCGTC 
CCCGA 

AGGCGGCGAACTTTTGCGAACACCATAGAAGGTAAAACACGGCTGGTATGGAGAGTGTTTATTTCTTTGGTGAGC 
ATATT 

AGTTTTTTGGGTATTAGCCACAACAACTTTTGGATGGTTAAAAAATCTGTTTTTTCCTCAACCTTCTCCGCCTGA 
TCTAC 

AGTTGTTTGTACAAATAAACCAACCACGGTTACCTATTCCCGATCCAAATAGAAAACCAGAATCAGAAGAAGGCC 
CTTTA 

ACAAATGCAGAGGCAGAAGAAGTTATTCACACTTGGTTATCTACCAAAGCCGCAGCTTTAGGGCCCAATCATGAG 
ATTAA 

TAATTTAGAGCAAATTTTAACTGGTTCAGCTTTATCTCAATGGCGACTGATTGCTCAACAGAATAAGTTAGACAA 
TCGCT 

ACCGCAAGTTCGACCATAGTTTGAAGATAGAATCTGTTGAGAAAATTGGTTTATTTGCAGATCGTGCCGCAGTAG 
AAGCT 

ACGGTCAAAGAAGTGACGCAGTTATATGAAAATAATCAGTTTAAAAACTCTTCTAACGATAAATTAAGAGTTCGG 
TATGA 

CTTGATTCGAGAACGAGGTAAATGGCGTATTCAGAGTACATCTGTTGTAAATCAATTCACCAGATAA 
PROTEIN 

VRIPLDYYRILGLPLAASEEQLRQAYSDRIVQLPRREYSQAAISSRKQLIEEAYVVLSDPKQRSTYDQLYLAHAY
DPDNL 

AAAAVAQENRTESTKRGSDTQSLGIEITQDELVGALLILQELGEYELVLKLGRPYLVNKNSATSSRKSNNLADEE 
IYESA 

EHPDVVLTVALACLELGREQWQQGHYENAAISLETGQELLVREGLFSSIQAEIQADLYKLRPYRILELLALPQEK
TAERS 

QGLELLQNLLEDRGGIDGTNNDESGLNIDDFLRFIQQLRNHLTVAEQHKLFEAQSKRSSAVATYLAVYALIARGF 
AQRQP 

ALIRQARQMLVRLGKRQDVHLEQSLCALLLGQTEEATRVLELSQEYEALAFIREKSQDSPDLLPGLCLYAEQWLQ 
HEVFP 

HFRDLANQQAFLKDYFANQQVQAYLEALPTDAQTTNEWAVINPQYFPQAKAKNTHFHNNSTKTSASFNHSRVPNP 
DLPET 

PTKETSEYPNFSPPMWSSSGSIKSEVPAAERMSRGTNQHLNGSAKSAASGHNQKRRRRKPTPSASRERIPDNRPH 
SRRPR 

RRRTFANTIEGKTRLVWRVFISLVSILVFWVLATTTFGWLKNLFFPQPSPPDLQLFVQINQPPLPIPDPNRKPES 
EEGPL 

TNAEAEEVIHTWLSTKAAALGPNHEINNLEQILTGSALSQWRLIAQQNKLDNRYRKFDHSLKIESVEKIGLFADR 
AAVEA 

TVKEVTQLYENNQFKNSSNDKLRVRYDLIRERGKWRIQSTSVVNQFTR*

>Synechocystis sp. strain PCC6803 D63999:2314780-2316924 complement

GTGTTTATCCCCCTCGACTTTTATCGTATTTTAGGCATTCCTCCCCAGAGTGGTGGGGAA 

ACCATTGAGCAGGCCTACCAAGATCGCCTTTTACAATTACCCCGGCGAGAATTTAGTGAC 

GCCGCAGTTACTCTCCGCAATCAATTACTGGCGATCGCCTATGAAACCCTGAGGGATCCG 

GAAAAACGTCAGGCATACGACCAAGAATGGTGGGGAGCCATGGATGAAGCCCTGGGGGAG 

GCCTTACCCCTCACTACCCCGGAGTTGGAATGTAGCCCAGAGCAAGAAATTGGAGCCCTG 

TTGATCCTGTTGGATTTGGGGGAATACGAACTCGTGGTTAAGTATGGTGAGCCAGTACTC 

CACGATCCCAACCCTCCGGCGGGAGGCCTGCCCCAGGACTATTTGCTTTCGGTAATTTTG 

GCCCACTGGGAACTGAGCCGGGAACGTTGGCAACAACAGCAGTATGAATTTGCCGCCACC 

GCCAGTCTTAAGGCCCTAGCTCGGTTGCAACAGGATAATGACTTCCCCGCCTTGGAAGCA 

GAAATTCGTCAGGAACTATACCGTCTGCGACCCTACCGTATCCTCGAACTTTTGGCTAAG 

GAGGGGCAAGGGGAGGAGCAACGTCAGCAGGGTCTAGCTCTGTTGCAAGCGATGGTGCAG 

GACCGGGGCGGCATTGAAGGTAAGGGGGAAGATTATTCCGGATTGGGAAATGATGACTTT 

CTAAAATTCATCCACCAACTACGCTGTCACCTCACAGTGGCCGAGCAAAACGCCCTATTT 

TTGCCCGAAAGTCAACGGCCATCTTTAGTAGCAAGCTATTTGGCAGTACATAGTCTGATG 

GCTGAGGGAGTGAAGGAACAGGACCCCATGGCCATTGTCGAAGCAAAATCTTTGATTATA 

CAGTTGGAAAATTGTCAAGATTTGGCCCTAGAAAAGGTAATTTGTGAATTATTATTGGGT 

CAAACGGAAGTTGTTCTGGCGGCGATCGACCAGGGAGATCCGAAAATAGTAGCTGGCCTC 

GAATCTAAGTTAGCGACGGGGGAAGACCCCTTAACTGCTTTTTATACTTTCACTGAGCAG 

TGGCTAGAGGAAGAAATTGTCCCCTACTTTAGGGATCTTTCTCCGGAGACCCTTTCCCCC 

AAGGCCTATTTCAATAATCCCTCCGTTCAGCAGTATCTAGAACAACTAGAGCCGGATTCC 

TTCACCACTGACAATTCTTTTGCCTCCCCTGCCCTCCTTAGCACCGCAACGGAATCGGAA 

ACTCCCATGGTACATAGTTCCGCCGCCCTTCCCGATCGCCCTTTGACCTCCACCGTTCCC 

TCACGACGGGGACGCAGTCCAAGACGTTCCCGAGACGATGTTTTCCCCAGCGCCGACAAT 

TCCAGTGGTTTGGCCGTCACCACCCTATCTCCGGCGATCGCCTACGACACCCACTCCTTG 

GGCACCAACGGTATTGGCGGGGATAGCACTAGCAACGGTTTTTCCAGTAACTCCGCCCCA 

GAATCCACCAGTAAACATAAATCTCCCCGGCGACGCAAAAAACGGGTGACCATCAAGCCG 

GTGCGCTTCGGGATTTTTCTGCTTTGCCTAGCAGGCATTGTGGGGGGGGCAACTGCCCTA 

ATTATCAATCGTACTGGCGATCCCCTAGGTGGGTTGCTAGAAGACCCCCTAGATGTTTTC 

CTGGACCAACCTTCAGAATTTATCCCCGATGAAGCCACGAGCCGGAATTTGATTCTCAGT 

CAACCCAACTTCAATCAGCAAGTGGGTCAGATGGTAGTACAAGGCTGGCTTGATAGTAAA 

AAGTTAGCCTTTGGCCAAAACTACGATGTCGGGGCATTGCAGAGTGTTTTAGCCCCCAAT 

CTCCTTGCCCAACAACGGGGTCGGGCCCAACGGGATCAAGCCCAAAAGGTCTATCACCAA 

TACGAACACAAGTTGCAGATTTTAGCCTATCAAGTTAACCCCCAAGACCCCAACCGAGCC 

ACCGTTACTGCCCGGGTAGAAGAAATTAGCCAGCCCTTTACCCTAGGTAATCAACAGCAG 

AAGGGCTCCGCCACCAAAGATGACTTGACTGTGCGCTATCAGCTAGTACGACACCAAGGG 

GTTTGGAAAATTGACCAAATACAAGTGGTAAATGGCCCCCGTTAG 



LOCUS       NP_441990     714 aa     linear   BCT 23-OCT-2001
DEFINITION  unknown protein [Synechocystis sp. PCC 6803].
ACCESSION   NP_441990
VERSION     NP_441990.1  GI:16331262
DBSOURCE    REFSEQ: accession NC_000911.1
KEYWORDS
SOURCE      Synechocystis sp. PCC 6803.
  ORGANISM  Synechocystis sp. PCC 6803
            Bacteria; Cyanobacteria; Chroococcales; Synechocystis.
REFERENCE   1  (residues 1 to 714)
  AUTHORS   Kaneko,T., Tanaka,A., Sato,S., Kotani,H., Sazuka,T., Miyajima,N.,
            Sugiura,M. and Tabata,S.
  TITLE     Sequence analysis of the genome of the unicellular cyanobacterium
            Synechocystis sp. strain PCC6803. I. Sequence features in the 1 Mb
            region from map positions 64% to 92% of the genome
  JOURNAL   DNA Res. 2 (4), 153-166 (1995)
  MEDLINE   96127529
REFERENCE   2  (residues 1 to 714)
  AUTHORS   Kaneko,T., Sato,S., Kotani,H., Tanaka,A., Asamizu,E., Nakamura,Y.,
            Miyajima,N., Hirosawa,M., Sugiura,M., Sasamoto,S., Kimura,T.,
            Hosouchi,T., Matsuno,A., Muraki,A., Nakazaki,N., Naruo,K.,
            Okumura,S., Shimpo,S., Takeuchi,C., Wada,T., Watanabe,A.,
            Yamada,M., Yasuda,M. and Tabata,S.
  TITLE     Sequence analysis of the genome of the unicellular cyanobacterium
            Synechocystis sp. strain PCC6803. II. Sequence determination of the
            entire genome and assignment of potential protein-coding regions
  JOURNAL   DNA Res. 3 (3), 109-136 (1996)
  MEDLINE   97061201
REFERENCE   3  (residues 1 to 714)
  AUTHORS   Tabata,S.
  TITLE     Direct Submission
  JOURNAL   Submitted (28-JUN-1996) Satoshi Tabata, Kazusa DNA Research
            Institute, The First Laboratory for Plant Gene Research; Yana
            1532-3, Kisarazu, Chiba 292-0812, Japan
            (E-mail:tabata@kazusa.or.jp,
            URL:http://www.kazusa.or.jp/cyano/,
            Tel:81-438-52-3933(ex.2330), Fax:81-438-52-3934)
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from BAA10060.
            Method: conceptual translation.
FEATURES             Location/Qualifiers
     source          1..714
                     /organism="Synechocystis sp. PCC 6803"
                     /db_xref="taxon:1148"
     Protein         1..714
                     /name="unknown protein"
     CDS             1..714
                     /gene="sll0169"
                     /coded_by="complement(NC_000911.1:2314780..2316924)"
                     /transl_table=11
ORIGIN
1 mfipldfyri lgippqsgge tieqayqdrl lqlprrefsd aavtlrnqll aiayetlrdp
61 ekrqaydqew wgamdealge alplttpele cspeqeigal lilldlgeye lvvkygepvl
121 hdpnppaggl pqdyllsvil ahwelsrerw qqqqyefaat aslkalarlq qdndfpalea
181 eirqelyrlr pyrilellak egqgeeqrqq glallqamvq drggiegkge dysglgnddf
241 lkfihqlrch ltvaeqnalf lpesqrpslv asylavhslm aegvkeqdpm aiveakslii
301 qlencqdlal ekvicelllg qtevvlaaid qgdpkivagl esklatgedp ltafytfteq
361 wleeeivpyf rdlspetlsp kayfnnpsvq qyleqlepds fttdnsfasp allstatese
421 tpmvhssaal pdrpltstvp srrgrsprrs rddvfpsadn ssglavttls paiaydthsl
481 gtngiggdst sngfssnsap estskhkspr rrkkrvtikp vrfgifllcl agivggatal
541 iinrtgdplg glledpldvf ldqpsefipd eatsrnlils qpnfnqqvgq mvvqgwldsk
601 klafgqnydv galqsvlapn llaqqrgraq rdqaqkvyhq yehklqilay qvnpqdpnra
661 tvtarveeis qpftlgnqqq kgsatkddlt vryqlvrhqg vwkidqiqvv ngpr

LOCUS       BAA10060     714 aa     linear   BCT 04-JUL-2001
DEFINITION  ORF_ID:sll0169~unknown protein [Synechocystis sp. PCC 6803].
ACCESSION   BAA10060
VERSION     BAA10060.1  GI:1001436
DBSOURCE    locus SYCSLRA accession D63999.1
KEYWORDS
SOURCE      Synechocystis sp. PCC 6803.
  ORGANISM  Synechocystis sp. PCC 6803
            Bacteria; Cyanobacteria; Chroococcales; Synechocystis.
REFERENCE   1  (residues 1 to 714)
  AUTHORS   Kaneko,T., Tanaka,A., Sato,S., Kotani,H., Sazuka,T., Miyajima,N.,
            Sugiura,M. and Tabata,S.
  TITLE     Sequence analysis of the genome of the unicellular cyanobacterium
            Synechocystis sp. strain PCC6803. I. Sequence features in the 1 Mb
            region from map positions 64% to 92% of the genome
  JOURNAL   DNA Res. 2 (4), 153-166 (1995)
  MEDLINE   96127529
   PUBMED   8590279
REFERENCE   2  (residues 1 to 714)
  AUTHORS   Kaneko,T., Sato,S., Kotani,H., Tanaka,A., Asamizu,E., Nakamura,Y.,
            Miyajima,N., Hirosawa,M., Sugiura,M., Sasamoto,S., Kimura,T.,
            Hosouchi,T., Matsuno,A., Muraki,A., Nakazaki,N., Naruo,K.,
            Okumura,S., Shimpo,S., Takeuchi,C., Wada,T., Watanabe,A.,
            Yamada,M., Yasuda,M. and Tabata,S.
  TITLE     Sequence analysis of the genome of the unicellular cyanobacterium
            Synechocystis sp. strain PCC6803. II. Sequence determination of the
            entire genome and assignment of potential protein-coding regions
  JOURNAL   DNA Res. 3 (3), 109-136 (1996)
  MEDLINE   97061201
   PUBMED   8905231
REFERENCE   3  (residues 1 to 714)
  AUTHORS   Tabata,S.
  TITLE     Direct Submission
  JOURNAL   Submitted (30-AUG-1995) Satoshi Tabata, Kazusa DNA Research
            Institute, The First Laboratory for Plant Gene Research; Yana
            1532-3, Kisarazu, Chiba 292-0812, Japan
            (E-mail:tabata@kazusa.or.jp,
            URL:http://www.kazusa.or.jp/cyano/,
            Tel:81-438-52-3933(ex.2330), Fax:81-438-52-3934)
COMMENT     Potential protein coding regions were assigned on the basis of
            similarity search of the ORFs and GeneMark analysis.
FEATURES             Location/Qualifiers
     source          1..714
                     /organism="Synechocystis sp. PCC 6803"
                     /strain="PCC6803"



FIG. 8 continued 71/110 



                     /db_xref="taxon:1148"
                     /note="synonym: Synechocystis PCC6803"
     Protein         1..714
                     /name="ORF_ID:sll0169
                     unknown protein"
     CDS             1..714
                     /gene="sll0169"
                     /coded_by="complement(D63999.1:47521..49665)"
                     /transl_table=11
ORIGIN

1 mfipldfyri lgippqsgge tieqayqdrl lqlprrefsd aavtlrnqll aiayetlrdp
61 ekrqaydqew wgamdealge alplttpele cspeqeigal lilldlgeye lvvkygepvl
121 hdpnppaggl pqdyllsvil ahwelsrerw qqqqyefaat aslkalarlq qdndfpalea
181 eirqelyrlr pyrilellak egqgeeqrqq glallqamvq drggiegkge dysglgnddf
241 lkfihqlrch ltvaeqnalf lpesqrpslv asylavhslm aegvkeqdpm aiveakslii
301 qlencqdlal ekvicelllg qtevvlaaid qgdpkivagl esklatgedp ltafytfteq
361 wleeeivpyf rdlspetlsp kayfnnpsvq qyleqlepds fttdnsfasp allstatese
421 tpmvhssaal pdrpltstvp srrgrsprrs rddvfpsadn ssglavttls paiaydthsl
481 gtngiggdst sngfssnsap estskhkspr rrkkrvtikp vrfgifllcl agivggatal
541 iinrtgdplg glledpldvf ldqpsefipd eatsrnlils qpnfnqqvgq mvvqgwldsk
601 klafgqnydv galqsvlapn llaqqrgraq rdqaqkvyhq yehklqilay qvnpqdpnra
661 tvtarveeis qpftlgnqqq kgsatkddlt vryqlvrhqg vwkidqiqvv ngpr

LOCUS       AY074283     2857 bp    mRNA    linear   PLN 26-APR-2002
DEFINITION  Arabidopsis thaliana unknown protein (At3g19180) mRNA, complete
            cds.
ACCESSION   AY074283
VERSION     AY074283.1  GI:18377659
KEYWORDS    FLI_CDNA.
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1  (bases 1 to 2857)
  AUTHORS   Yamada,K., Liu,S.X., Sakano,H., Pham,P.K., Banh,J., Chung,M.K.,
            Goldsmith,A.D., Lee,J.M., Quach,H.L., Toriumi,M., Yu,G., Bowser,L.,
            Carninci,P., Chen,H., Cheuk,R., Hayashizaki,Y., Ishida,J.,
            Jones,T., Kamiya,A., Karlin-Neumann,G., Kawai,J., Kim,C., Lam,B.,
            Lin,J., Miranda,M., Narusaka,M., Nguyen,M., Palm,C.J., Sakurai,T.,
            Satou,M., Seki,M., Shinn,P., Southwick,A., Shinozaki,K.,
            Davis,R.W., Ecker,J.R. and Theologis,A.
  TITLE     Arabidopsis Full Length cDNA Clones
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2857)
  AUTHORS   Yamada,K., Banh,J., Chan,M.M., Chang,C.H., Chang,E., Dale,J.M.,
            Deng,J.M., Goldsmith,A.D., Lee,J.M., Onodera,C.S., Quach,H.L.,
            Tang,C.C., Toriumi,M., Wu,H.C., Yamamura,Y., Yu,G., Bowser,L.,
            Carninci,P., Chen,H., Cheuk,R., Hayashizaki,Y., Ishida,J.,
            Jones,T., Kamiya,A., Karlin-Neumann,G., Kawai,J., Kim,C., Lam,B.,
            Lin,J., Meyers,M.C., Miranda,M., Narusaka,M., Nguyen,M., Palm,C.J.,
            Sakurai,T., Satou,M., Seki,M., Shinn,P., Southwick,A.,
            Shinozaki,K., Davis,R.W., Ecker,J.R. and Theologis,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (11-JAN-2002) Plant Gene Expression Center, 800 Buchanan
            Street, Albany, CA 94710, USA
COMMENT     RIKEN Genomic Sciences Center (GSC) members carried out the
            collection and clustering of RAFL cDNAs (RAFL cDNA: 'RIKEN
            Arabidopsis Full-Length cDNA'): Seki,M., Narusaka,M., Ishida,J.,
            Satou,M., Kamiya,A., Sakurai,T., Carninci,P., Kawai,J.,
            Hayashizaki,Y. and Shinozaki,K.
            The Salk, Stanford, PGEC (SSP) Consortium members carried out the
            sequencing and annotation of the RAFL cDNAs: Yamada,K., Banh,J.,
            Chan,M.M., Chang,C.H., Chang,E., Dale,J.M., Deng,J.M.,
            Goldsmith,A.D., Lee,J.M., Onodera,C.S., Quach,H.L., Tang,C.C.,
            Toriumi,M., Wu,H.C., Yamamura,Y., Yu,G., Bowser,L., Chen,H.,
            Cheuk,R., Jones,T., Karlin-Neumann,G., Kim,C., Lam,B., Lin,J.,
            Meyers,M.C., Miranda,M., Nguyen,M., Palm,C.J., Shinn,P.,
            Southwick,A., Davis,R.W., Ecker,J.R. and Theologis,A.
            Yamada,K. (SSP/PGEC) and Seki,M. (RIKEN GSC) contributed equally to
            this work. Shinozaki,K. (RIKEN GSC) and Theologis,A. (SSP/PGEC)
            contributed equally to this work as PIs.
FEATURES             Location/Qualifiers
     source          1..2857
                     /organism="Arabidopsis thaliana"
                     /db_xref="taxon:3702"
                     /chromosome="3"
                     /clone="RAFL09-57-L03(R19126)"
                     /note="This clone is in a modified pBluescript vector
                     (FLC-1) as a BamHI/XhoI insert.
                     ecotype: Columbia"
     gene            1..2857
                     /gene="At3g19180"
     5'UTR           1..134
                     /gene="At3g19180"
     CDS             135..2594
                     /gene="At3g19180"
                     /codon_start=1
                     /evidence=experimental
                     /product="unknown protein"
                     /protein_id="AAL66980.1"
                     /db_xref="GI:18377660"
                     /translation="MPVAYTFPVLPSSCLLCGISNRSTSFVVDRPELQISGLLVVRSE
                     SGEFFGSGLSLRRFQREGRRRLNAAGGGIHVVDNAPSRTSSLAASTSTIELPVTCYQL
                     IGVSEQAEKDEVVKSVINLKKTDAEEGYTMEAAAARQDLLMDVRDKLLFESEYAGNLK
                     EKIAPKSPLRIPWAWLPGALCLLQEVGQEKLVLDIGRAALRNLDSKPYIHDIFLSMAL
                     AECAIAKAAFEVNKVSQGFEALARAQSFLKSKVTLGKLALLTQIEESLEGLAPPCTLD
                     LLGLPRTPENAERRRGAIAALRELLRQGLSVEASCQIQDWPCFLSQAISRLLATEIVD
                     LLPWDDLAITRKNKKSLESHNQRVVIDFNCFYMVLLGHIAVGFSGKQNETINKAKTIC
                     ECLIASEGVDLKFEEAFCSFLLKQGSEAEALEKLKQLESNSDSAVRNSILGKESRSTS
                     ATPSLEAWLMESVLANFPDTRGCSPSLANFFRAEKKYPENKKMGSPSIMNHKTNQRPL
                     STTQFVNSSQHLYTAVEQLTPTDLQSPVVSAKNNDETSASMPSVQLKRNLGVHKNKIW
                     DEWLSQSSLIGRVSVVALLGCTVFFSLKLSGIRSGRLQSMPISVSARPHSESDSFLWK
                     TESGNFRKNLDSVNRNGIVGNIKVLIDMLKMHCGEHPDALYLKSSGQSATSLSHSASE
                     LHKRPMDTEEAEELVRQWENVKAEALGPTHQVYSLSEVLDESMLVQWQTLAQTAEAKS
                     CYWRFVLLHLEVLQAHIFEDGIAGEAAEIEALLEEAAELVDESQPKNAKYYSTYKIRY
                     ILKKQEDGLWKFCQSDIQIQK"
     misc_difference 937
                     /gene="At3g19180"
                     /note="compared to genomic sequence resulting in an amino
                     acid sequence difference"
                     /replace="a"
     3'UTR           2595..2857
                     /gene="At3g19180"
     misc_difference 2841
                     /gene="At3g19180"
                     /note="not present in genomic sequence"
BASE COUNT      808 a    584 c    644 g    821 t
ORIGIN

1 actgtcaaaa ctcaaaagcc ttgagaccaa atttccgatt ttttctcctc tgaagaaatc 
61 caacaaattg taccatgatt ccagcttcac tctacttctt ctagggttcg ttcgttttct 
121 ggagctgttg cgcaatgcca gtagcttaca catttccagt tctcccttct tcttgtctgc 
181 tttgcggaat ctccaatcgc agcaccagct tcgtcgtaga tcgcccggag cttcagatct 
241 caggtctcct cgtcgttcgt tctgaatccg gtgaattctt cggttctggt ttatctttgc 
301 ggcggtttca gcgagaagga cggaggaggt tgaatgctgc tggtggtggt atccatgtcg
361 tcgacaatgc gccgtctcgt acttcttctc tcgctgcatc tacctctaca atcgaactcc 
421 cggttacgtg ttaccagctt atcggagttt ctgagcaagc tgagaaagac gaggtcgtta 
481 agtcggttat aaatttgaaa aaaactgatg ctgaagaggg ttatacaatg gaagctgctg 
541 cagctcgcca ggatcttctc atggatgtta gggataaact tctttttgaa tcagaatatg 
601 ctggtaacct aaaagaaaag attgctccta aatctcctct cagaattccg tgggcatggt 
661 tgcctggtgc tctatgcctt cttcaagagg ttggacaaga aaaacttgtg ctggatattg 
721 gccgggctgc tctcaggaac cttgattcaa agccatatat tcatgatata ttcttatcta 
781 tggcacttgc tgagtgtgca attgccaagg ctgctttcga ggttaacaag gtctctcaag 
841 gatttgaagc tcttgctcgt gctcaaagtt ttctgaagag taaagttact cttgggaaac 
901 ttgcattgtt aactcagatt gaggagtcac tagaggggct tgcaccacct tgcacattgg 
961 atctactggg cctgccacgc acgccagaaa atgcagagag gaggcgaggt gcaattgccg 
1021 cgctacgcga actgctcaga cagggcctta gtgttgaagc ttcatgtcaa attcaagact 
1081 ggccatgctt tttgagccag gcaattagca ggttattggc cacagagatt gtcgatcttc 
1141 ttccatggga tgatttagcc attacacgga aaaataaaaa atcactggaa tcccacaatc 
1201 aaagagttgt tattgatttt aattgtttct acatggtgtt acttggtcac atcgctgttg 
1261 gattttcagg caagcaaaat gaaacgatta ataaagcaaa aacgatatgc gaatgtctca 
1321 tagcatcaga aggtgttgat ctgaaatttg aggaagcttt ttgctcattt cttctaaaac 
1381 agggttccga ggcagaggcc ctggaaaaac ttaagcagct ggaatcaaat tcagactctg
1441 ccgttcgtaa ttcgatcttg gggaaagagt cgagaagtac ttctgctact ccctcactgg 
1501 aagcgtggct aatggagtcc gtgcttgcta actttccaga cacaaggggt tgttctccat 
1561 ctttggccaa ttttttccgg gctgaaaaga aatatccaga aaacaagaaa atggggtcac 
1621 cttcgatcat gaatcataag acgaaccaaa gaccactttc cacaacacag ttcgtgaact 
1681 cgtcacaaca tctttataca gctgtcgagc agttgacacc aacagatttg cagagcccag 
1741 tggtatcagc caagaataat gatgaaacca gtgccagtat gccatctgtt caactgaaga 
1801 ggaaccttgg tgtacacaaa aataaaatat gggatgagtg gctctctcaa agcagtttga 
1861 tcggaagggt atctgttgtt gctttactgg gttgcaccgt gttcttctct ctgaagctat 
1921 caggcattag gtctggtaga ctacagagta tgcctatatc ggtttctgct aggccgcatt 
1981 cagaatcaga ttcttttctg tggaaaacag agtctgggaa tttcagaaaa aaccttgatt 
2041 ctgtgaatag aaatggtatc gtgggaaaca tcaaagtgct cattgacatg ttaaagatgc
2101 attgtggcga acatccggat gccctgtatc tgaaaagctc tggtcaatca gctacatcat 
2161 tgtctcattc tgcgtcagaa ctgcataaga gaccaatgga tacagaagaa gcggaagagc 
2221 ttgtgagaca gtgggaaaat gttaaggctg aagctcttgg accaacacat caagtttata 
2281 gcctttccga agtccttgat gaatccatgc ttgtccagtg gcaaacattg gcacaaacag
2341 cagaggcgaa atcctgttat tggaggttcg ttctgcttca tcttgaggtt ttgcaagcac
2401 atatattcga agatggtatt gctggtgagg ctgcagaaat cgaagctctt ctggaggaag
2461 cagcagaatt agttgatgaa tctcagccca aaaacgcaaa atattatagc acttacaaga
2521 tccgatatat tctgaagaag caagaagatg gattgtggaa attctgccaa agcgatattc
2581 aaatacagaa gtgaaaatcc cccagaaaaa aaagctcatc atctaactaa aggttgtagc
2641 atcaacagta gaacatggga tcatttagct aacggttgtt cttgtttacc taacggtgta
2701 ggaaagtctc aggtttgttt ctttattcct tagtaaccca caggatttgt ctttgtagat
2761 tcttttgatt tcaatgtgtt tatggataaa caaacttctt gagtattttt tttattatta
2821 ttgtaaagcg ttactgatca caaaaaaaaa aaaaaaa

LOCUS       AAL66980     819 aa     linear   PLN 26-APR-2002
DEFINITION  unknown protein [Arabidopsis thaliana].
ACCESSION   AAL66980
VERSION     AAL66980.1  GI:18377660
DBSOURCE    accession AY074283.1
KEYWORDS

SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1  (residues 1 to 819)
  AUTHORS   Yamada,K., Liu,S.X., Sakano,H., Pham,P.K., Banh,J., Chung,M.K.,
            Goldsmith,A.D., Lee,J.M., Quach,H.L., Toriumi,M., Yu,G., Bowser,L.,
            Carninci,P., Chen,H., Cheuk,R., Hayashizaki,Y., Ishida,J.,
            Jones,T., Kamiya,A., Karlin-Neumann,G., Kawai,J., Kim,C., Lam,B.,
            Lin,J., Miranda,M., Narusaka,M., Nguyen,M., Palm,C.J., Sakurai,T.,
            Satou,M., Seki,M., Shinn,P., Southwick,A., Shinozaki,K.,
            Davis,R.W., Ecker,J.R. and Theologis,A.
  TITLE     Arabidopsis Full Length cDNA Clones
  JOURNAL   Unpublished
REFERENCE   2  (residues 1 to 819)
  AUTHORS   Yamada,K., Banh,J., Chan,M.M., Chang,C.H., Chang,E., Dale,J.M.,
            Deng,J.M., Goldsmith,A.D., Lee,J.M., Onodera,C.S., Quach,H.L.,
            Tang,C.C., Toriumi,M., Wu,H.C., Yamamura,Y., Yu,G., Bowser,L.,
            Carninci,P., Chen,H., Cheuk,R., Hayashizaki,Y., Ishida,J.,
            Jones,T., Kamiya,A., Karlin-Neumann,G., Kawai,J., Kim,C., Lam,B.,
            Lin,J., Meyers,M.C., Miranda,M., Narusaka,M., Nguyen,M., Palm,C.J.,
            Sakurai,T., Satou,M., Seki,M., Shinn,P., Southwick,A.,
            Shinozaki,K., Davis,R.W., Ecker,J.R. and Theologis,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (11-JAN-2002) Plant Gene Expression Center, 800 Buchanan
            Street, Albany, CA 94710, USA
COMMENT     RIKEN Genomic Sciences Center (GSC) members carried out the
            collection and clustering of RAFL cDNAs (RAFL cDNA: 'RIKEN
            Arabidopsis Full-Length cDNA'): Seki,M., Narusaka,M., Ishida,J.,
            Satou,M., Kamiya,A., Sakurai,T., Carninci,P., Kawai,J.,
            Hayashizaki,Y. and Shinozaki,K.
            The Salk, Stanford, PGEC (SSP) Consortium members carried out the
            sequencing and annotation of the RAFL cDNAs: Yamada,K., Banh,J.,
            Chan,M.M., Chang,C.H., Chang,E., Dale,J.M., Deng,J.M.,
            Goldsmith,A.D., Lee,J.M., Onodera,C.S., Quach,H.L., Tang,C.C.,
            Toriumi,M., Wu,H.C., Yamamura,Y., Yu,G., Bowser,L., Chen,H.,
            Cheuk,R., Jones,T., Karlin-Neumann,G., Kim,C., Lam,B., Lin,J.,
            Meyers,M.C., Miranda,M., Nguyen,M., Palm,C.J., Shinn,P.,
            Southwick,A., Davis,R.W., Ecker,J.R. and Theologis,A.
            Yamada,K. (SSP/PGEC) and Seki,M. (RIKEN GSC) contributed equally to
            this work. Shinozaki,K. (RIKEN GSC) and Theologis,A. (SSP/PGEC)
            contributed equally to this work as PIs.
            Method: conceptual translation.
FEATURES             Location/Qualifiers
     source          1..819
                     /organism="Arabidopsis thaliana"
                     /db_xref="taxon:3702"
                     /chromosome="3"
                     /clone="RAFL09-57-L03(R19126)"
                     /note="This clone is in a modified pBluescript vector
                     (FLC-1) as a BamHI/XhoI insert.
                     ecotype: Columbia"
     Protein         1..819
                     /product="unknown protein"
     CDS             1..819
                     /gene="At3g19180"
                     /coded_by="AY074283.1:135..2594"
ORIGIN



1 mpvaytfpvl psscllcgis nrstsfvvdr pelqisgllv vrsesgeffg sglslrrfqr
61 egrrrlnaag ggihvvdnap srtsslaast stielpvtcy qligvseqae kdevvksvin
121 lkktdaeegy tmeaaaarqd llmdvrdkll feseyagnlk ekiapksplr ipwawlpgal
181 cllqevgqek lvldigraal rnldskpyih diflsmalae caiakaafev nkvsqgfeal
241 araqsflksk vtlgklallt qieeslegla ppctldllgl prtpenaerr rgaiaalrel
301 lrqglsveas cqiqdwpcfl sqaisrllat eivdllpwdd laitrknkks leshnqrvvi
361 dfncfymvll ghiavgfsgk qnetinkakt icecliaseg vdlkfeeafc sfllkqgsea
421 ealeklkqle snsdsavrns ilgkesrsts atpsleawlm esvlanfpdt rgcspslanf
481 fraekkypen kkmgspsimn hktnqrplst tqfvnssqhl ytaveqltpt dlqspvvsak
541 nndetsasmp svqlkrnlgv hknkiwdewl sqssligrvs vvallgctvf fslklsgirs
601 grlqsmpisv sarphsesds flwktesgnf rknldsvnrn givgnikvli dmlkmhcgeh
661 pdalylkssg qsatslshsa selhkrpmdt eeaeelvrqw envkaealgp thqvyslsev
721 ldesmlvqwq tlaqtaeaks cywrfvllhl evlqahifed giageaaeie alleeaaelv
781 desqpknaky ystykiryil kkqedglwkf cqsdiqiqk

LOCUS       NC_003074     23465812 bp    DNA     linear   PLN 10-JAN-2002
DEFINITION  Arabidopsis thaliana chromosome 3, complete sequence.
ACCESSION   NC_003074
VERSION     NC_003074.2  GI:18426881
KEYWORDS    HTG.
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1  (bases 1 to 23465812)
  AUTHORS   Town,C.D., Haas,B.J., Wu,D., Maiti,R., Hannick,L.I., Chan,A.P.,
            Tallon,L.J., Rooney,T., Utterback,T.R., VanAken,S.E.,
            Feldblyum,T.V., White,O. and Fraser,C.M.
  TITLE     Arabidopsis thaliana chromosome 3 genomic sequence
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 23465812)
  AUTHORS   Town,C.D. and Kaul,S.
  TITLE     Direct Submission
  JOURNAL   Submitted (10-JAN-2002) The Institute for Genomic Research, 9712
            Medical Center Dr, Rockville, MD 20850, USA, cdtown@tigr.org
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AE102093.
            On Jan 30, 2002 this sequence version replaced gi:15228160.
            Address all correspondence to: at@tigr.org
            Genes were identified by a combination of several methods: Gene
            prediction programs including Genscan+ (Chris Burge,
            http://CCR-081.mit.edu/GENSCAN.html), GeneMarkHMM (Mark Borodovsky,
            http://genemark.biology.gatech.edu/GeneMark/), GlimmerA (a variant
            of GlimmerM, see Mihaela Pertea,
            http://www.tigr.org/softlab/glimmerm_htm/glimmerm.html, and
            GeneSplicer (Mihaela Pertea and Steven Salzberg, contact
            mpertea@tigr.org), searches of the complete sequence against a
            peptide database and the plant EST database at TIGR
            (http://www.tigr.org/tdb/tgi.shtml). Annotated genes are named to
            indicate the level of evidence for their annotation. Genes with
            similarity to other proteins are named after the database hits.
            Genes without significant peptide similarity but with EST
            similarity are named as unknown proteins. Genes without protein
            or EST similarity, that are predicted by more than two gene
            prediction programs over most of their length are annotated as
            hypothetical proteins. Genes encoding tRNAs are predicted by
            tRNAscan-SE (Sean Eddy, http://genome.wustl.edu/eddy/tRNAscan-SE/).
            Simple repeats are identified by repeatmasker (Arian Smit,
            http://ftp.genome.washington.edu/RM/RepeatMasker.html).
FEATURES             Location/Qualifiers
     source          1..23465812
                     /organism="Arabidopsis thaliana"
                     /cultivar="Columbia"
                     /db_xref="taxon:3702"
                     /chromosome="3"
     gene            6632806..6639031
                     /gene="At3g19180"
                     /note="MVI11.9; predicted by genscan+"
     mRNA            join(<6632806..6633108,6633408..6633521,
                     6633599..6633736,6633812..6633916,6634008..6634130,
                     6634812..6634907,6635016..6635168,6635577..6635642,
                     6635728..6636480,6636588..6636778,6636865..6636945,
                     6637595..6637697,6637777..6637843,6638047..6638104,
                     6638203..6638365,6638457..6638663,6638749..6638929,
                     6639021..>6639031)
                     /gene="At3g19180"
                     /transcript_id="NM_112805.1"
                     /db_xref="GI:18402148"
     CDS             join(6632806..6633108,6633408..6633521,6633599..6633736,
                     6633812..6633916,6634008..6634130,6634812..6634907,
                     6635016..6635168,6635577..6635642,6635728..6636480,
                     6636588..6636778,6636865..6636945,6637595..6637697,
                     6637777..6637843,6638047..6638104,6638203..6638365,
                     6638457..6638663,6638749..6638929,6639021..6639031)
                     /gene="At3g19180"
                     /codon_start=1
                     /protein_id="NP_188549.1"
                     /db_xref="GI:15230315"

Second Set



dbEST Id:       12028705
EST name:       BJ258222
GenBank Acc:    BJ258222
GenBank gi:     20081080

CLONE INFO
Clone Id:       whh6h02 (5')
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown

SEQUENCE

GGCCGTCGGCAAATACTGCAGNTTGCACATGATACTCTCACAAACCAGAGCTCCCGCACC
GAGTATGACCGCGCGCTCTCTGAGGACCGTGACGCGGCGCTCACACTGGATGTTGCTTGG
GACAAGGTTCCGGGTGTGCTATGTGCCCTTCAGGAGGCTGGGGAGGCACAGGCAGTGCTT
GCAATTGGAGAGCACTTACTGGAGGACCGCCCGCCCAAGCGGTTCAAGCAGGATGTGGTG
CTGGCAATGGCGCTCGCTTATGTGGACATATCAAGGGATGCAATGGCGGCTAGCCCTCCA
GATGTAATCCGCTGCTGTGAGGTGCTTGAAAGGGCTCTCAAGCTCTTGCAGGAGGATGGG
GCAATCAACCTTGCACCTGGTCTGCTTTCACAAATTGATGAAACTCTGGAGGAGATCACA
CCTCGTTGTGTTTTGGAGCTTCTTGCCCTTNCTCTTGATGAAAAACATCANATTGAACGC
CANNAANGNNT

Entry Created:  Apr 8 2002
Last Updated:   Apr 8 2002

LIBRARY
Lib Name:       Y. Ogihara unpublished cDNA library, Wh_h
Organism:       Triticum aestivum
Cultivar:       Chinese Spring
Tissue type:    spike at heading date
Develop. stage: Feekes' scale 10.5

SUBMITTER
Name:           Tadasu Shin-i
Lab:            Center For Genetic Resource Information
Institution:    National Institute of Genetics
Address:        1111 Yata, Mishima, Shizuoka 411-8540, Japan
Tel:            81-559-81-6856
Fax:            81-559-81-6855
E-mail:         tshini@genes.nig.ac.jp

CITATIONS
Title:          Expressed genes in Triticum aestivum.
Authors:        Ogihara,Y., Murai,K.
Year:           2002
Status:         Unpublished

dbEST Id:       12455031
EST name:       GA__Ed0029A07f
GenBank Acc:    BQ410206
GenBank gi:     21097893

CLONE INFO
Clone Id:       GA__Ed0029A07f
Source:         CUGI
DNA type:       cDNA

PRIMERS
Sequencing:     TAATACGACTCACTATAGGG
PolyA Tail:     Unknown



SEQUENCE

AATTGCAGAAGGCATTGTTCGCAAGTGGCAGAACATTAAATCTGAGGCGTTTGGACCTGA
TCACCGCCTTGATAAATTGCCAGAGGTTCTGGATGGTCAAATGTTGAAGACATGGACAGA
TCGTGCAGCCGAAATCGCTCAGCTTGGTTGGGTATATGAATATAGTCTACTGAACATGGC
CATTGACAGTGTTACCCTTTCACTAGATGGCCAGCGAGCTGTAGTCGAAGCTACTCTGGA
AGAATCCACCTGCTTGACTGATGTTCATCATCCGGAGAACAATGCCTCTAATGTAAACTC
CTACACCACGAGATATGAGATGTCTTGTTCCAACTCAGGCTGGAAAATCACTGAAGGATC
TGTCTACAAATCTTAACTATGATGTATAAAGCATAAAAAGCCTGAAAGCTCCAATGTGGT
TACCAGCTTTGCCTTTTTACGTAGCTATATTTGTTATATTGTTTGAGAAAACAAGAGTTA
GCGTTTTCCAGTCATGCAAGCAGTTCAAATTAAAAGAGGCAATGCTTNTCATGGANAACN
AAATG

Quality:        High quality sequence stops at base: 538
Entry Created:  May 22 2002
Last Updated:   May 22 2002

COMMENTS
                Total High Quality bases = 521

LIBRARY
Lib Name:       Gossypium arboreum 7-10 dpa fiber library
Organism:       Gossypium arboreum
Strain:         AKA
Cultivar:       8400
Tissue type:    Fibers isolated from bolls harvested 7-10 dpa
Lab host:       E. coli
Vector:         pBK-CMV
R. Site 1:      EcoRI
R. Site 2:      XhoI

SUBMITTER
Name:           Wing RA
Lab:            Clemson University Genomics Institute
Institution:    Clemson University
Address:        100 Jordan Hall, Clemson, SC 29634, USA
Tel:            864 656 7288
Fax:            864 656 4293
E-mail:         rwing@clemson.edu

CITATIONS
Title:          An integrated analysis of the genetics, development, and
                evolution of the cotton fiber
Authors:        Wing,R.A., Frisch,D., Yu,Y., Main,D., Rambo,T., Simmons,J.,
                Henry,D., Wood,T.C., Leslie,A., Wilkins,T.A.
Year:           2000
Status:         Unpublished

dbEST Id:       12551917
EST name:       AJ485537
GenBank Acc:    AJ485537
GenBank gi:     21201492

CLONE INFO
Clone Id:       S0001100068E09F1
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown



SEQUENCE

GATGAGCCCATACAGATTCCTAAAATGGATGCGAAGCTGGCAGAAGATATTGTTCGCAAG
TGGCAGAGCATCAAATCCAAGGCCTTGGGATCAGATCATTCTGTTGCATCATTGCAAGAG
GTTCTTGATGGCAACATGCTGAAGGTATGGACAGACCGAGCAGCAGAGATTGAGCGCAAA
GGCTGGTTCTGGGACTACACGCTGTTCAACGTGGCGATCGACAGCATCACCGTCTCCCTG
GACGGACGGCGGGCGACCGTGGAGGCGACAATTGAGGAGGCGGGTCAGCTCACCGACGCA
ACCGACCCCAGGAACGATGATTTGTACGACACTAAGTACACCACCCGGTACGAGATGGCC
TTCACCGGACCAGGAGGGTGGAAGATAACCGAAGGCGCAGTCCTCAAGTCGTCATAGGGC

Entry Created:  May 24 2002
Last Updated:   May 24 2002

LIBRARY
Lib Name:       S00011
Organism:       Hordeum vulgare
Develop. stage: Developing seed
Description:    12,15,18 days after pollination

SUBMITTER
Name:           Schulman AH
Lab:            Institute of Biotechnology
Institution:    University of Helsinki
Address:        P.O.Box 56 (Viikinkaari 6A), University of Helsinki
                FIN-00014, Finland

CITATIONS
Title:          Barley EST's
Authors:        Saren,A.-M., Tanskanen,J., Paulin,L., Schulman,A.H.
Year:           2002
Status:         Unpublished

dbEST Id:       12032032
EST name:       BJ263824
GenBank Acc:    BJ263824
GenBank gi:     20084407

CLONE INFO
Clone Id:       whh6h02 (3')
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown



SEQUENCE

CTGCAAATCTAGCACTATGTTTCTCTTTATCTCCAGGATCTAGCCTAGCACCAACAATCC
AAATACAACACAAGAAAAATAAAGCTCTTCGTCGATCACATCAGACTAACGCAACTATCG
GTCTTCCAAACTAAAAAGGGCCTAGACTGCCTGCTTATTTACACACCCCCAAAAGAAAAC
TGGAAGGAATTAACAAACTTAATGAGGTTACCGCACACCAACTACCCTAAGACGACTTGA
GGACCGCGCCTTCCATTATCTTCCACCCTCCTAGTCCGGTGAAGGTCATCTCATACCGGG
TGGTGTACTTCGTGTCGTACGAGTCGTTGTTCTTGGGGTCGGTTGCGTCGATGAGCTGGC
CTGCCTCCTCGATCGTTGCCTCCACGGTCGCCCGCCGTCCGTCCAGGGAGACCGTGATGC
TGTCGATCGCCACGTCAGACAGTGTGTAGTCCCAGAACCAGCCTTTGCGCCCGATCTCCG
CTGCTCGGTCCGTCCATACCTTCAGCATGTTGCCATCAAGAACCTCTTGCAATGATTCCA
CAGAATGATCTGATCCCAAGGCCTTGGTTTTGATACTCTGCCACTTGCGAACAATATCTT
CTGCCA

Entry Created:  Apr 8 2002
Last Updated:   Apr 8 2002

LIBRARY
Lib Name:       Y. Ogihara unpublished cDNA library, Wh_h
Organism:       Triticum aestivum
Cultivar:       Chinese Spring
Tissue type:    spike at heading date
Develop. stage: Feekes' scale 10.5

SUBMITTER
Name:           Tadasu Shin-i
Lab:            Center For Genetic Resource Information
Institution:    National Institute of Genetics
Address:        1111 Yata, Mishima, Shizuoka 411-8540, Japan
Tel:            81-559-81-6856
Fax:            81-559-81-6855
E-mail:         tshini@genes.nig.ac.jp

CITATIONS
Title:          Expressed genes in Triticum aestivum.
Authors:        Ogihara,Y., Murai,K.
Year:           2002
Status:         Unpublished

dbEST Id:       12455032
EST name:       GA__Ed0029A07r
GenBank Acc:    BQ410207
GenBank gi:     21097894

CLONE INFO
Clone Id:       GA__Ed0029A07r
Source:         CUGI
DNA type:       cDNA

PRIMERS
Sequencing:     TAATACGACTCACTATAGGG
PolyA Tail:     Unknown



SEQUENCE

•pTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTAACTTGCCTCTTTT^
CTGCTTGCCTGACTGGAAAACCCTAACTCTTGTTTTCTCAAACAATTTAACAAATATAGC
TCCCTAAAAAGGCAAAGCTGGTAACCACATTGGAGCTTTCAGGCTTTTTATGCTTTATAC
ATCATAGTTAAAATTTGTAGACAGATCCTTCAGTGATTTTCCAACCTGAGTTGGAACAAA
ACATCTCATATTTCGTGGGGTAGGAGTTTACATTACAGGCATTGTTCTCCGGATGATGAA
CATTACTCAAGCCGGGGGGTTCTTCCAAAATAACTTCGACTACAGCTCGCTGGCCATTTA
ATGAAAGGGTAACACTGTCAATGGCCCTGTTCAGTCAACTTTATTCATATACCCAACCCA
GCTGACCGATTTCGGCTGCACCAACTGTCCATGTTTTCAACATTTGACCATCCAAAACCT
TTGGCAATTTATCAAGGGGGGGATCAAGTCCAAACGCCTCAGATTTAATGTTCTGCCACT
TGCGAACAATGCCTTTTGCAATT

Quality:        High quality sequence starts at base: 3
Quality:        High quality sequence stops at base: 554
Entry Created:  May 22 2002
Last Updated:   May 22 2002

COMMENTS
                Total High Quality bases = 222

LIBRARY
Lib Name:       Gossypium arboreum 7-10 dpa fiber library
Organism:       Gossypium arboreum
Strain:         AKA
Cultivar:       8400
Tissue type:    Fibers isolated from bolls harvested 7-10 dpa
Lab host:       E. coli
Vector:         pBK-CMV
R. Site 1:      EcoRI
R. Site 2:      XhoI

SUBMITTER
Name:           Wing RA
Lab:            Clemson University Genomics Institute
Institution:    Clemson University
Address:        100 Jordan Hall, Clemson, SC 29634, USA
Tel:            864 656 7288
Fax:            864 656 4293
E-mail:         rwing@clemson.edu

CITATIONS
Title:          An integrated analysis of the genetics, development, and
                evolution of the cotton fiber
Authors:        Wing,R.A., Frisch,D., Yu,Y., Main,D., Rambo,T., Simmons,J.,
                Henry,D., Wood,T.C., Leslie,A., Wilkins,T.A.
Year:           2000
Status:         Unpublished

dbEST Id:       12551919
EST name:       AJ485539
GenBank Acc:    AJ485539
GenBank gi:     21201494

CLONE INFO
Clone Id:       S0001100117E11F1
DNA type:       cDNA

PRIMERS
PolyA Tail:     Unknown



SEQUENCE

GATGAGCCCATACAGATTCCTAAAATGGATGCGAAGCTGGCAGAAGATATTGTTCGCAAG
TGGCAGAGCATCAAATCCAAGGCCTTGGGATCAGATCATTCTGTTGCATCATTGCAAGAG
GTTCTTGATGGCAACATGCTGAAGGTATGGACAGACCGAGCAGCAGAGATTGAGCGCAAA
GGCTGGTTCTGGGACTACACGCTGTTCAACGTGGCGATCGACAGCATCACCGTCTCCCTG
GACGGACGGCGGGCGACCGTGGAGGCGACAATTGAGGAGGCGGGTCAGCTCACCGACGCA
ACCGACCCCAGGAACGATGATTTGTACGACACTAAGTACACCACCCGGTACGAGATGGCC

Entry Created:  May 24 2002
Last Updated:   May 24 2002

LIBRARY
Lib Name:       S00011
Organism:       Hordeum vulgare
Develop. stage: Developing seed
Description:    12,15,18 days after pollination

SUBMITTER
Name:           Schulman AH
Lab:            Institute of Biotechnology
Institution:    University of Helsinki
Address:        P.O.Box 56 (Viikinkaari 6A), University of Helsinki
                FIN-00014, Finland

CITATIONS
Title:          Barley EST's
Authors:        Saren,A.-M., Tanskanen,J., Paulin,L., Schulman,A.H.
Year:           2002
Status:         Unpublished



FIG. 8 continued 90/110

dbEST Id: 12426231
EST name: AJ463103
GenBank Acc: AJ463103
GenBank gi: 21062023

CLONE INFO
Clone Id: S0000200015A03F1
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown



SEQUENCE 



TGATGGCAACATGCTGAAGGTATGGACAGACCGAGCAGCAGAGATTGAGCGCAAAGGCTG 
GTTCTGGGACTACACGCTGTTCAACGTGGCGATCGACAGCATCACCGTCTCCCTGGACGG 
ACGGCGGGCGACCGTGGAGGCGACAATTGAGGAGGCGGGTCAGCTCACCGACGCAACCGA 
CCCCAGGAACGATGATTTGTACGACACTAAGTACACCACCCGGTACGAGATGGCCTTCAC 
CGGACCAGGAGGGTGGAAGATAACCGAAGGCGCAGTCCTCAAGTCGTCATAGGGCGTTCA 
Entry Created: May 21 2002
Last Updated: May 24 2002



LIBRARY
Lib Name: S00002
Organism: Hordeum vulgare
Cultivar: Saana
Develop. stage: Embryo
Description: 1 day after pollination



SUBMITTER
Name: Schulman AH
Lab: Institute of Biotechnology
Institution: University of Helsinki
Address: P.O.Box 56 (Viikinkaari 6A), University of Helsinki, FIN-00014, Finland



CITATIONS
Title: Barley EST's
Authors: Saren,A.-M., Tanskanen,J., Paulin,L., Schulman,A.H.
Year: 2002
Status: Unpublished

dbEST Id: 12172134
EST name: WHE2493_E05_J09ZT
GenBank Acc: BQ169059
GenBank gi: 20315019

CLONE INFO
Clone Id: WHE2493_E05_J09
DNA type: cDNA

PRIMERS
Sequencing: T7 primer
PolyA Tail: Unknown



SEQUENCE 



TTTTTTTTTTTTTTTTTTTTTTTTTTTTCAGCGGCAAATTCAGCACTATGTTTCTCTTAT 
CCCCAACTCAAAGATCTTCTAAGCTAGCAATAATCCGAAAACGACACAGGGAAAAACAAA 
GCTCATCGCTGATTGCACATCAGACTAACCAAACTATCTCCAACTTCCAAACTGAGAAGG 
GCCTAGACTGCTTATTTACACACCAAAAAGAACACGGGAGGAATCAATCAACAAAGGTCT 
ACTGCACACCGAACGCCCTATGACGACTTGAGGACCGCACCTTCTGTTATCTTCCACCCT 
CCTGGTCCAGTGAAGGTCATCTCGTACCGGGTGGTGTACTTAGTGTCGTACAAATCGTTG 
TTCCTGGGGTCGGTTGCATCGGTAAGCTGGCCTGCCTCCTCAATTGTCGCCTCCACAGTC 
GCCCGTCGTCCGTCCAGGGAGACGGTGATGCTGTCAATCGCCACGTCGGACAGCGTGTAG 
TCCCAGAACCAGCCTTTGCGCTCGATCTCTGCTGCTCGGTCCCTCCATACCTTCAGCATG
TTGCCATCA

Entry Created: Apr 25 2002
Last Updated: Apr 25 2002

COMMENTS
This EST was generated by sequencing from the 3' end of the clone. Sequences have been trimmed to remove vector sequence and low quality sequence with phred score less than 20.
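
The comment for this EST says the read was taken from the 3' end of the clone and trimmed at a phred score of 20. The short Python sketch below illustrates both points: reverse-complementing a 3' read (whose leading poly-T run corresponds to the transcript's poly-A tail) so it can be compared with 5' reads, and converting a phred score to its error probability. The function names are illustrative assumptions and are not part of dbEST or phred.

```python
# Illustrative helpers only; the names are hypothetical, not from dbEST or phred.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A<->T, C<->G, N kept)."""
    return seq.translate(COMPLEMENT)[::-1]


def phred_error_probability(q):
    """Standard phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


# Shortened start of the 3' read shown above: a poly-T run followed by CAGCGG.
read_start = "TTTTTTTTCAGCGG"
print(reverse_complement(read_start))
print(phred_error_probability(20))
```

A trimming threshold of 20 therefore corresponds to an estimated error rate of about 1 in 100 base calls.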



LIBRARY
Lib Name: Triticum monococcum early reproductive apex cDNA library
Organism: Triticum monococcum
Cultivar: DV92
Tissue type: Early reproductive apex
Develop. stage: Seven week-old plants
Lab host: E. coli XLOLR
Vector: Lambda Uni-ZAP XR, excised phagemid
R. Site 1: EcoRI
R. Site 2: XhoI
Description: The tissue, total RNA, and poly(A) RNA were prepared from apex at double-ridge stage to terminal-spikelet stage during transition from vegetative state to flower state, a cDNA library was made, and the cDNA clones were in vivo excised at the University of California, Davis (V. Echenique, B. Stamova, J. Dubcovsky). Plasmid DNA preparations and DNA sequencing were performed in the OD Anderson lab (all other authors).

SUBMITTER
Name: Olin Anderson
Institution: US Department of Agriculture, Agriculture Research Service, Pacific West Area, Western Regional Research Center
Address: 800 Buchanan Street, Albany, CA 94710, USA
Tel: 5105595773
Fax: 5105595818
E-mail: oandersn@pw.usda.gov

CITATIONS
Title: The structure and function of the expressed portion of the wheat genomes - Early reproductive apex cDNA library from Triticum monococcum
Authors: Anderson,O.D., Chao,S., Dubcovsky,J., Echenique,V., Hsia,C.C., Kang,Y., Lazo,G.R., Miller,R., Rausch,C.J., Seaton,C.L., Stamova,B., Tong,J.C.
Year: 2001
Status: Unpublished

dbEST Id: 12506802
EST name: BJ482132
GenBank Acc: BJ482132
GenBank gi: 21160594

CLONE INFO
Clone Id: bah63k10 (5')
DNA type: cDNA

PRIMERS
PolyA Tail: Unknown



SEQUENCE 



GCGAGNAAGGACGAGNATCGTCAAGTCGGCCATCGAGCTGAGGAAATCGGAGATCGAAGA 

TGGGTACACGGAGGAGGTGTCCACCTGCAGACAGGCTCTGCTGCTGGACGTGAGAGACAA 

GCTTCTCTTTGAACAGGAGTACGCAGGAAGCACCAGGGCCAAGGTTCCGCCCAGATCCTC 

TCTTCATATACCCTGGAGCTGGTTGCCTGCTGCCTTGTGTGTCTTGCAGGAGGTTGGGGA 

AGAGAAGCTGGTCTTGGACATTGGTCAGGCAGCTCTACGACGCCCTGATTCTAAGCCATA 

TGCTCACGATGTACTTCTTGCAATGGCACTAGCTGAATGCTCCATTGCAAAAGCTAGCTT 

TGAAAAAAGTAAAGTATCTCTTGGCTTTGAGGCTCTAGCACGTGCTCAATATCTTTTGAG 

GAAAAAACCATCTTTAGAGAAGATGCCTCTTCTTGAGCAGATCGAAGAATCACTTGAAGA 

GCTTGCACCAGCTTGCACTCTAGAGGTTTTAAGCCTGCCCCGTACACCTGAAAATTCTGA 
ACGCAGGCGTGGTGCTATTGCAGCTCTCTGTGA 



Entry Created: May 23 2002
Last Updated: May 23 2002

LIBRARY
Lib Name: K. Sato unpublished cDNA library, strain H602 adult, heading stage top three leaves
Organism: Hordeum vulgare subsp. spontaneum
Strain: H602
Tissue type: top three leaves
Develop. stage: adult, heading stage



SUBMITTER 

Name: Tadasu Shin-i

Lab: Center For Genetic Resource Information 

Institution: National Institute of Genetics 

Address: 1111 Yata, Mishima, Shizuoka 411-8540, Japan 

Tel: 81-559-81-6856 

Fax: 81-559-81-6855 

E-mail: tshini@genes.nig.ac.jp 



CITATIONS
Title: Barley EST sequencing project in NIG and Okayama Univ
Authors: Sato,K., Saisho,D., Takeda,K.
Year: 2002
Status: Unpublished

dbEST Id: 12601756
EST name: 27-E011788-006-050-F04-T3
GenBank Acc: BQ490457
GenBank gi: 21335077

CLONE INFO
Clone Id: F-4-50
DNA type: cDNA

PRIMERS
Sequencing: T3 AATTAACCCTCACTAAAGGG
PolyA Tail: Unknown

SEQUENCE



GCATAACACGGCAAGAAGATGTTGCAGTTAATGGCTTTGGAAATGAGGATGTTACAATGG 

AGCTTGGCCGTGATAACACTTTAGATTATGTGAATTTAGCCAGTTCAAATTTTACTGAAG 

ATAATATCGAGCAAGAATCGGTTACTGAGAAGATAAAAGATTTAGGTGTGAAGGTTATGT

GTGCCGGTGTGGTGATTGGACTGACAACTTTGGCTGGCATGAAACTTTTGCCTGGCAGAA 

GTGGGTCTGCCATTCCACACAGGCATCTTGGTTCTGCTGTGGCTTCTGATGTCTCCAGTG 

TGGGGCTCTCAGTAAATGAAACTACTGAGGAGAAAGTACCAAAAATGGATGCAAGACTTG 
CAGAAGTTCTAGTTAGAAGATGGCAGAACGTTAAATCACA 
Quality: High quality sequence stops at base: 400

Entry Created: Jun 7 2002
Last Updated: Jun 7 2002

LIBRARY
Lib Name: Sugar beet MPIZ-ADIS-006 Lambda Zap II library
Organism: Beta vulgaris
Organ: shoot and root
Develop. stage: 4 week old pot-grown plants
Vector: pBluescript SK- from lambda ZAP II
Description: cDNA (lambda ZAP-II) library from sugar beet, whole plant mRNA, Prepared using the Stratagene UniZAP cDNA kit, cloning sites EcoRI-XhoI, primer sites and orientation: rev-T3-SacI-SK-EcoRI-GGCACGAGG-5pr-cDNA-polyA-XhoI-KpnI-T7



SUBMITTER
Name: Weisshaar B
Lab: ADIS DNA core facility at MPIZ
Institution: Max-Planck-Institute for Plant Breeding Research
Address: Carl-von-Linne Weg 10, 50829 Koeln, Germany
Fax: 00492215062851
E-mail: weisshaa@mpiz-koeln.mpg.de



CITATIONS
Title: EST sequencing, annotation and macroarray expression analysis of more than 3000 sugar beet cDNAs identifies genes with root-specific expression pattern.
Authors: Bellin,D., Werber,M., Theis,T., Weisshaar,B., Schneider,K.
Year: 2002
Status: Unpublished

///
>gi|22486832|gb|BU046755.1|BU046755 PP_LEa0027I04f Peach developing fruit mesocarp Prunus persica cDNA clone PP_LEa0027I04f.
Length = 631

Score = 256 bits (653), Expect = 7e-67
Identities = 132/198 (66%), Positives = 149/198 (75%), Gaps = 4/198 (2%)
Frame = +1

Query: 315 REKFMNEAFLRMTAAEQVDLFVATPSNIPAESFEVYEVALALVAQAFIGKKPHLXXXXXX 374
           RE FMNEAFL MTAAEQVDLFVATPSNIPAESFEVY VALALVAQAF+GKKPH
Sbjct: 31  RENFMNEAFLHMTAAEQVDLFVATPSNIPAESFEVYGVALALVAQAFVGKKPHHIQDAEN 210

Query: 375 XXXXXXXXXVMAMEIPAMLYDTRNNWEIDFGLERGLCALLIGKVDECRMWLGLDSEDSQY 434
                    V A+      Y T+ + EIDF LERGLC+LL+G +D+ R WLGLDS DS Y
Sbjct: 211 LFQKLQQSKVTAVGHSLDNYITKESSEIDFALERGLCSLLLGDLDDSRSWLGLDSNDSPY 390

Query: 435 RNPAIVEFVLENSNRDDNDD----LPGLCKLLETWLAGVVFPRFRDTKDKKFKLGDYYDD 490
           RNP++V+FVLENS  DD++D    LPGLCKLLETWL  VVFPRFRDTKD +F+LGDYYDD
Sbjct: 391 RNPSVVDFVLENSKDDDDNDNDNDLPGLCKLLETWLMEVVFPRFRDTKDIEFRLGDYYDD 570

Query: 491 PMVLSYLERVEVVQGSPL 508
           P VL YLER++   GSPL
Sbjct: 571 PTVLRYLERLDGTNGSPL 624
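
The summary figures and coordinates of this alignment can be checked with a few lines of arithmetic. The Python sketch below assumes the percentages are truncated rather than rounded, which is inferred only from the numbers printed here (132/198 is 66.7% but is shown as 66%). It also checks that translated subject coordinates span a whole number of codons. Both function names are hypothetical.

```python
def blast_percent(matches, columns):
    """Integer percentage, truncated, matching the summary lines shown above."""
    return 100 * matches // columns


def aligned_codons(start, end):
    """Number of codons covered by 1-based, inclusive nucleotide coordinates."""
    span = abs(end - start) + 1
    if span % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    return span // 3


# Identities, positives and gaps for the peach hit.
print(blast_percent(132, 198), blast_percent(149, 198), blast_percent(4, 198))
# First subject row: nucleotides 31..210 in frame +1.
print(aligned_codons(31, 210))
```

The second check is a quick way to catch a mis-read coordinate: each full row of a translated alignment covers 60 residues, so its subject coordinates should span 180 nucleotides.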

LOCUS       BU046755     631 bp    mRNA    linear   EST 26-AUG-2002
DEFINITION  PP_LEa0027I04f Peach developing fruit mesocarp Prunus persica cDNA
            clone PP_LEa0027I04f, mRNA sequence.
ACCESSION   BU046755
VERSION     BU046755.1  GI:22486832
KEYWORDS    EST.
SOURCE      Prunus persica (peach)
  ORGANISM  Prunus persica
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids;
            eurosids I; Rosales; Rosaceae; Amygdaloideae; Prunus.
REFERENCE   1 (bases 1 to 631)
  AUTHORS   Callahan,A., Palmer,M., Main,D., Wing,R. and Abbott,A.
  TITLE     Peach Model Genome for Rosaceae
  JOURNAL   Unpublished
COMMENT     Contact: Abbott, A.
            Dept of Genetics and Biochemistry
            Clemson University
            122 Long Hall, Clemson University, Clemson, SC 29634, USA
            Tel: 864 656 3060
            Fax: 864 656 6879
            Email: aalbert@clemson.edu
            Total High Quality bases = 523
            Seq primer: TAATACGACTCACTATAGGG
            High quality sequence stop: 631.
FEATURES             Location/Qualifiers
     source          1..631
                     /organism="Prunus persica"
                     /mol_type="mRNA"
                     /cultivar="Loring"
                     /db_xref="taxon:3760"
                     /clone="PP_LEa0027I04f"
                     /tissue_type="Mesocarp"
                     /lab_host="E. coli"
                     /clone_lib="Peach developing fruit mesocarp"
                     /note="Vector: pBluescript II SK(-); Site_1: EcoRI;
                     Site_2: XhoI; authority=Prunus persica L. Batsh; The
                     sequence has been trimmed to remove vector sequence and
                     contains a minimum of 100 bases of phred value 20 or
                     above. For more details on library preparation and
                     sequence analysis go to
                     http://www.genome.clemson.edu/projects/peach. To order
                     this clone go to http://www.genome.clemson.edu/orders"



BASE COUNT 174 a 123 c 155 g 178 t 1 others
ORIGIN
1 gcagttgcaa ttgctggggg ngattcacta cgtgaaaatt tcatgaacga ggccttcttg
61 catatgactg cagctgagca ggttgattta tttgtagcta cccccagtaa tatcccggca
121 gaaagctttg aagtttatgg ggtggctctt gcgcttgttg ctcaagcctt tgttggtaaa
181 aaacctcatc acattcaaga tgctgaaaac ctattccaga aacttcagca gtctaaggta
241 acagctgtag gacattctct tgacaactat ataaccaaag aaagcagtga gatagacttt
301 gctttggaga ggggactctg ttcacttctt ctaggggacc ttgatgacag tcgttcgtgg
361 ttgggcctag acagtaatga ttcaccatat agaaatccat ctgttgtaga ctttgtcttg
421 gagaactcaa aggatgacga tgacaatgac aatgacaatg atcttcctgg actttgcaag
481 ctattggaga cgtggttgat ggaggtggta ttccccaggt ttagagacac caaagacata
541 gagttcagac tgggagacta ctatgatgat cctacagtct tgagatactt agaaaggctg
601 gatggcacta atggttcacc cttagctgct g



//
>gi|22471250|gb|BU035730.1|BU035730 QHJ7N08.yg.ab1 QH_EFGHJ sunflower RHA280 Helianthus annuus cDNA clone QHJ7N08.
Length = 647

Score = 178 bits (451), Expect = 2e-43
Identities = 96/178 (53%), Positives = 122/178 (68%), Gaps = 3/178 (1%)
Frame = +1



Query: 627 GLISLFSQKYFLK---SSSSFQRKDMVSSMESDVATIGSVRADDSEALPRMDARTAENIV 683
           GL++L   K+      S+S+  RK++ S++ SDV  +   R +D+E +P+MDAR AE +V
Sbjct: 16  GLMTLAGLKFIPS*TGSTSTTARKEVDSALASDVTNVEDSRVEDAEDIPKMDARLAEGLV 195

Query: 684 SKWQKIKSLAFGPDHRIEMLPEVLDGRMLKIWTDRAAETAQLGLVYDYTLLKLSVDSVTV 743
            KWQ IKS A GP+H    L  VLDG M KIW  RA E AQ G  +DYTLL +++DSVTV
Sbjct: 196 RKWQSIKSQALGPEHCHSKLS*VLDGEMHKIWLQRATEIAQRGWFWDYTLLNITIDSVTV 375

Query: 744 SADGTRALVEATLEESACLSDLVHPENNATDVRTYTTRYEVFWSKSGWKITEGSVLAS 801
           S DG  A+VEATLEESA L DL HPENN +   TYTTRYE+  +KS WKIT+G+VL S
Sbjct: 376 SLDGRLAVVEATLEESAKLIDLTHPENNDSYNLTYTTRYEMSCAKSSWKITKGAVLKS 549

LOCUS       BU035730     647 bp    mRNA    linear   EST 23-AUG-2002
DEFINITION  QHJ7N08.yg.ab1 QH_EFGHJ sunflower RHA280 Helianthus annuus cDNA
            clone QHJ7N08, mRNA sequence.
ACCESSION   BU035730
VERSION     BU035730.1  GI:22471250
KEYWORDS    EST.
SOURCE      Helianthus annuus (common sunflower)
  ORGANISM  Helianthus annuus
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; campanulids; Asterales; Asteraceae; Asteroideae;
            Heliantheae; Helianthus.
REFERENCE   1 (bases 1 to 647)
  AUTHORS   Kozik,A., Michelmore,R.W., Knapp,S., Matvienko,M., Rieseberg,L.,
            Lin,H., van Damme,M., Lavelle,D., Chevalier,P., Ziegle,J.,
            Ellison,P., Kolkman,J., Slabaugh,M.S., Livingston,K., Zhou,Y.,
            Lai,Z., Church,S., Jackson,L. and Bradford,K.
  TITLE     Lettuce and Sunflower ESTs from the Compositae Genome Project
            http://compgenomics.ucdavis.edu/
  JOURNAL   Unpublished
COMMENT     Contact: Alexander Kozik [R.W. Michelmore]
            Department of Vegetable Crops, R.W. Michelmore Lab
            University of California at Davis (UCD)
            Asmundson Hall, UCD, Davis, CA 95616, USA
            Tel: 1-(530)-742-1742
            Fax: 1-(530)-752-9659
            Email: akozik@atgc.org [michelmore@vegmail.ucdavis.edu]
            belongs to contig QH_CA_Contig4396, see http://cgpdb.ucdavis.edu/
            for details.
            Plate: QHJ7 row: N column: 08.
FEATURES             Location/Qualifiers
     source          1..647
                     /organism="Helianthus annuus"
                     /mol_type="mRNA"
                     /cultivar="RHA280"
                     /db_xref="taxon:4232"
                     /clone="QHJ7N08"
                     /lab_host="E. coli"
                     /clone_lib="QH_EFGHJ sunflower RHA280"
                     /note="Vector: pBRcDNASfiAB; The library was constructed
                     from 11 different sources of RNA from a single genotype.
                     Separate cDNAs were generated using primers that
                     incorporated unique 5' and 3' tags to distinguish each
                     source of RNA. cDNAs were then pooled, size-fractionated,
                     directionally cloned into a custom medium-copy vector and
                     transformations made with four size classes to minimize
                     size bias. Details of each source of RNA and library
                     construction can be obtained at
                     http://cgpdb.ucdavis.edu/
                     TAG_LIB=QH_EFGHJ sunflower RHA280
                     TAG_TISSUE=germinating seeds
                     TAG_SEQ=TCTGTGCGGG"
BASE COUNT 181 a 133 c 145 g 188 t

ORIGIN 

1 cagaaagagg tggctggatt gatgactttg gctggcttga aatttatacc gtcttaaaca 
61 ggctctacta gtactactgc tcgtaaagaa gttgattcgg ctctggcttc agacgtcacc 
121 aatgtggagg attctagggt tgaggatgct gaagacattc ctaaaatgga tgcaagatta 
181 gccgaaggtc tagttcgtaa gtggcagagc ataaaatccc aagcccttgg acctgagcat 
241 tgccactcaa aattatcata ggtattagat ggtgaaatgc acaagatctg gcttcaacgg 
301 gcaaccgaaa ttgctcaacg tggttggttt tgggactaca cgcttttaaa cattaccatt 
361 gacagtgtta ccgtttcact cgatgggcgc ttagctgttg tggaagcaac ccttgaagag 
421 tctgccaagt tgattgattt gacccacccg gaaaacaatg actcctataa tttaacttac 
481 accacacgtt atgagatgtc gtgtgccaag tcatcatgga aaatcacaaa gggggctgtc 
541 ctcaaatcat aacagatgta attctttctc accttttctg tatttatctg ttattagatt 
601 actcagcagt tgaatgatat gtttctccac catttcgatc atgagcg 
//
>gi|22394580|gb|BQ977057.1|BQ977057 QHI23M11.yg.ab1 QH_ABCDI sunflower RHA801 Helianthus annuus cDNA clone QHI23M11.
Length = 652

Score = 166 bits (421), Expect = 5e-40
Identities = 85/138 (61%), Positives = 101/138 (73%)
Frame = +1



Query: 664 RADDSEALPRMDARTAENIVSKWQKIKSLAFGPDHRIEMLPEVLDGRMLKIWTDRAAETA 723
           R +D+E +P+MDAR AE +V KWQ IKS A GP+H    L EVLDG M KIW  RA E A
Sbjct: 127 RVEDAEDIPKMDARLAEGLVRKWQSIKSQALGPEHCHSKLSEVLDGEMHKIWLQRATEIA 306

Query: 724 QLGLVYDYTLLKLSVDSVTVSADGTRALVEATLEESACLSDLVHPENNATDVRTYTTRYE 783
           Q G  +DYTLL +++DSVTVS DG  A+VEATLEESA L DL HPENN +   TYTTRYE
Sbjct: 307 QRGWFWDYTLLNITIDSVTVSLDGRLAVVEATLEESAKLIDLTHPENNDSYNLTYTTRYE 486

Query: 784 VFWSKSGWKITEGSVLAS 801
           +  +KS WKIT+G+VL S
Sbjct: 487 MSCAKSSWKITKGAVLKS 540



LOCUS       BQ977057     652 bp    mRNA    linear   EST 21-AUG-2002
DEFINITION  QHI23M11.yg.ab1 QH_ABCDI sunflower RHA801 Helianthus annuus cDNA
            clone QHI23M11, mRNA sequence.
ACCESSION   BQ977057
VERSION     BQ977057.1  GI:22394580
KEYWORDS    EST.
SOURCE      Helianthus annuus (common sunflower)
  ORGANISM  Helianthus annuus
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; campanulids; Asterales; Asteraceae; Asteroideae;
            Heliantheae; Helianthus.
REFERENCE   1 (bases 1 to 652)
  AUTHORS   Kozik,A., Michelmore,R.W., Knapp,S., Matvienko,M., Rieseberg,L.,
            Lin,H., van Damme,M., Lavelle,D., Chevalier,P., Ziegle,J.,
            Ellison,P., Kolkman,J., Slabaugh,M.S., Livingston,K., Zhou,Y.,
            Lai,Z., Church,S., Jackson,L. and Bradford,K.
  TITLE     Lettuce and Sunflower ESTs from the Compositae Genome Project
            http://compgenomics.ucdavis.edu/
  JOURNAL   Unpublished
COMMENT     Contact: Alexander Kozik [R.W. Michelmore]
            Department of Vegetable Crops, R.W. Michelmore Lab
            University of California at Davis (UCD)
            Asmundson Hall, UCD, Davis, CA 95616, USA
            Tel: 1-(530)-742-1742
            Fax: 1-(530)-752-9659
            Email: akozik@atgc.org [michelmore@vegmail.ucdavis.edu]
            belongs to contig QH_CA_Contig4396, see http://cgpdb.ucdavis.edu/
            for details.
            Plate: QHI23 row: M column: 11.
FEATURES             Location/Qualifiers
     source          1..652
                     /organism="Helianthus annuus"
                     /mol_type="mRNA"
                     /cultivar="RHA801"
                     /db_xref="taxon:4232"
                     /clone="QHI23M11"
                     /lab_host="E. coli"
                     /clone_lib="QH_ABCDI sunflower RHA801"
                     /note="Vector: pBRcDNASfiAB; The library was constructed
                     from 11 different sources of RNA from a single genotype.
                     Separate cDNAs were generated using primers that
                     incorporated unique 5' and 3' tags to distinguish each
                     source of RNA. cDNAs were then pooled, size-fractionated,
                     directionally cloned into a custom medium-copy vector and
                     transformations made with four size classes to minimize
                     size bias. Details of each source of RNA and library
                     construction can be obtained at
                     http://cgpdb.ucdavis.edu/
                     TAG_LIB=QH_ABCDI sunflower RHA801
                     TAG_TISSUE=germinating seeds
                     TAG_SEQ=TCTGTGCGGG"
BASE COUNT 178 a 135 c 148 g 191 t

ORIGIN 

1 tgtggtggtt ggattgatga ctttggctgg cttgaaattt acaccgtcca aaagaggctc 
61 tactagtact actgctcgta aagaagttga ttcggctctg gcttcagacg tcaccaatag 
121 gattctaggg ttgaggatgc tgaagacatt cctaaaatgg atgcaagatt agccgagggt 
181 ctagttcgta agtggcagag cataaaatcc caagcccttg gacctgagca ttgccactca 
241 aaattatcag aggtattaga tggtgaaatg cacaagatct ggcttcaacg ggcaaccgaa 
301 attgctcaac gtggttggtt ttgggactac acgcttttaa acattaccat tgacagtgtt 
361 accgtctcac tcgatgggcg cttagctgtt gtggaagcaa cccttgaaga gtctgccaag 
421 ttgattgatt tgacccaccc ggaaaacaat gactcctata atttaactta caccacacgt 
481 tatgagatgt cgtgtgccaa gtcttcatgg aaaatcacaa agggggctgt cctcaaatca 
541 taacagatgt aattctttct caccttttct gtatttaact gttattagat tactcagcag 
601 ttgaatgata tgtttctcca ccatatcgat catgagtgta tttggtgctg cc 
//

>gi|24100065|gb|BU889000.1|BU889000 P015D07 Populus petioles cDNA library Populus tremula cDNA 5 prime.
Length = 460

Score = 152 bits (384), Expect = 1e-35
Identities = 87/149 (58%), Positives = 104/149 (69%), Gaps = 2/149 (1%)
Frame = +1

Query: 613 KEASVKILAAGVAIGLISLFSQKYFLKSSSSFQR-KDMVSSMESDVATIGS-VRADDSEA 670
           K    + + AGVAIGL++L   K F   + SF R K++ S+M SD   + S V    SE
Sbjct: 13  KRCQYQNMCAGVAIGLLTLAGLKCFPPRTGSFIRQKEIGSAMASDTINLNSAVDEQISED 192

Query: 671 LPRMDARTAENIVSKWQKIKSLAFGPDHRIEMLPEVLDGRMLKIWTDRAAETAQLGLVYD 730
           LPRMDAR AE+IV KWQ IKS AFG DH +  LPEVLD +MLKIWTDRAAE A LG VY+
Sbjct: 193 LPRMDARGAEDIVRKWQNIKSQAFGTDHCLAKLPEVLDSQMLKIWTDRAAEIAHLGWVYE 372

Query: 731 YTLLKLSVDSVTVSADGTRALVEATLEES 759
           Y LL L++DSVTVS DG  A+VEATL+ES
Sbjct: 373 YMLLDLTIDSVTVSVDGLNAVVEATLKES 459

LOCUS       BU889000     460 bp    mRNA    linear   EST 17-OCT-2002
DEFINITION  P015D07 Populus petioles cDNA library Populus tremula cDNA 5
            prime, mRNA sequence.
ACCESSION   BU889000
VERSION     BU889000.1  GI:24100065
KEYWORDS    EST.
SOURCE      Populus tremula
  ORGANISM  Populus tremula
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids;
            eurosids I; Malpighiales; Salicaceae; Populus.
REFERENCE   1 (bases 1 to 460)
  AUTHORS   Unneberg,P., Bhalerao,R.R., Jansson,S. and Sterky,F.
  TITLE     The poplar tree transcriptome: Analysis of expressed sequence tags
            from multiple libraries
  JOURNAL   Unpublished
COMMENT     Contact: BHALERAO RUPALI R.
            Umea Plant Science Center
            Department of Plant Physiology
            University of Umea, 901 87 Umea, Sweden
            Tel: +46 90 786 5279
            Fax: +46 90 786 6676
            Email: rupali.bhalerao@plantphys.umu.se.
FEATURES             Location/Qualifiers
     source          1..460
                     /organism="Populus tremula"
                     /mol_type="mRNA"
                     /db_xref="taxon:113636"
                     /tissue_type="petioles"
                     /clone_lib="Populus petioles cDNA library"
BASE COUNT 138 a 82 c 117 g 123 t

ORIGIN 

1 gactgaaaaa ataaaagatg ccagtatcaa aatatgtgtg ctggtgtggc aattggactg 
61 ctgactttag ctggcctgaa gtgttttcct cctaggactg gctccttcat tcgacagaaa 
121 gaaattggtt cggcaatggc atctgacacc atcaatttga attcagcagt agatgaacaa 
181 atttccgagg acttacccag aatggatgca aggggtgcag aggatatagt tcgcaagtgg 
241 caaaacatta aatctcaggc ttttggaact gatcactgcc tggcaaaatt gccagaggtt 
301 ttggatagtc agatgttgaa aatatggaca gatcgtgcgg ccgaaattgc acatcttggt 
361 tgggtatacg agtatatgct gttggacctg actattgaca gtgtgactgt atctgtagat 
421 ggcctaaatg ctgtagtaga agcaacactc aaagagtcaa 

// 
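
A GenBank record can be checked for internal consistency: the BASE COUNT figures should add up to the length given on the LOCUS line, and the ORIGIN block should yield the same counts once position numbers and spaces are stripped. The Python sketch below shows one way to do that. The helper names are assumptions made for illustration, and only the first ORIGIN line of the record above is parsed as a sample.

```python
from collections import Counter


def origin_to_sequence(origin_lines):
    """Join GenBank ORIGIN lines, dropping position numbers and spaces."""
    return "".join(ch for line in origin_lines for ch in line if ch.isalpha()).lower()


def base_count(seq):
    """Count a/c/g/t; any other symbol (for example n) is tallied as 'others'."""
    counts = Counter(seq)
    others = sum(n for base, n in counts.items() if base not in "acgt")
    return {"a": counts["a"], "c": counts["c"], "g": counts["g"],
            "t": counts["t"], "others": others}


# BASE COUNT reported for BU889000 should sum to the 460 bp LOCUS length.
reported = {"a": 138, "c": 82, "g": 117, "t": 123}
print(sum(reported.values()))

# First ORIGIN line of the same record, used as a small parsing check.
line = "1 gactgaaaaa ataaaagatg ccagtatcaa aatatgtgtg ctggtgtggc aattggactg"
seq = origin_to_sequence([line])
print(len(seq), base_count(seq))
```

Running the same comparison over the full ORIGIN block is a cheap way to detect bases lost or altered during transcription.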



Chlamydomonas reinhardtii ARC6-like Gene Sequence

Gene model at http://genome.jgi-psf.org/cgi-bin/dispGeneModel.v4?db=chlre1&id=

Genomic Sequence [46927:50859] Exons are underlined

>genie.294.6|Genomic

ATGAACTCGGCGGAGCACGTCTCTGTTGCCGTGGACTATTACCGAATGCTGCACGTTGCCCGCGTAAGCC

GCCCTGACGCCATTCGCAAGGCGTATGAGAACCTGGTGAAGCAACCCCCCGCTGCCGCGTACTCTGCGGA 

CACCCTCTTCGCACGCGCGGTGCTACTCAAGGCAGCCGCGGAGTCGCTGACCGACCCGGACCTGCGCCGC 

TCATATGACGCCAAGCTGGCCGCTGGTCACACAGCCCTGCGCGTCAGCCAGCAGGACCTACCCGGAGCCC 

TTGTCGTGCTGCAGGAG GTGAGCCGTGCTCTGGCGACCGCTCAACCCCTTGCGACCGCTAAAACCATCAG 

CACATATAGCACATATAAATTCCCATGGGTTCTGTACTACCGCCCACCCCTCTGAAGGGGGCGAGTATTC 

ATTCTTCACGCATGAGCGCAGACTTTTACCCTATCAAGTCCCGCCCTCGCCCGCCTTCTCTTCCCACAGA 

TCGGCGAGCACCAGTTGGTTCTGGATCTGGGTCTGCGCTGGCTAGAGGTAAACGGCGGCCAGCCCGACGC 

CGGCGACGTGGCCGCTGCCGTGGCCCTGGCCTACTGTGACCGCGCTGGTGAGCGCCTCACCTCCCAGCTG 

CAGCCGCCGCCGGCCTCAGCGCTGCCAGGCCCCGATGGCGCGGCGGTGCCGCACGCGCACGTGGGCGCGG 

TGCTGCCCGCATGCGACGACCTGGACGCAGCGCTGAGCAAGCTCCGGCGGTACGGCATGGCGCAGCAGCT 

GCAGCAGCAGATCGTGGGCGCGCTGCGG GTGAGGCTGGAGCAGGGGCTGGACCGGCAACCGGTCATAGAT 

GTAGACACAGGGATGTAGGCGTCGATGCGAGGGGATGGAAGTATGGGGTCCTGTGAGTGTGAGCCGATGG 

AAGGTATAGATGCTGGGAGCTGGCGCACCCGACCCATGTCATCCAAGGACTTGGCTGATGCATCGCTCAC 

CCCCCGCCTCCAACCCGAATGCCCTCAG GACCTGGCGCCAGAGTACGCGTGCGAGCTGGCCGCCCTGCCG 

CTGGGCGCCGAGACCGCCGCCCGGCGCGCCAAGGGCGTGGCGCTCATGCGCGGTGTGCTGCGCGCCGCCG 

CCACCGTGGCCGCCGCCACAGCCAA GTAGGTGACAAGCACGCAGGAAATCGTGTGCTATATTGCATTGCG 

GTACCTTGCCTTGCATCGCGGAGGCAGTGCTCGAGAATGCGTTTCGTGCGCGTGATCCGTTTGCTCGTCG 

TGCCTTATCCGCCACCCCAG GCCCGAGGCTGCTGCTGACGACAGCGACGACGACGAGGTGGACCCGCGCA

GTGTGCTGGCGGCCGCCCGCCGCATGCTGACCCGCAGCCGCGACGTGCTCACCTGCAGCGAGCAG GTACA 

GCGCTGCAACCGGGCAGTTATAGATGGATGCAAGTGCGTGGACGCCGAACGTACAGTTTTTGCTGTGTTC 

CCCGCGTGCACCTTAGCCGCTCCTCCTGCAACCCTCACTTGCGACCTCAATGCGTGCACCTTAGCCGCTC 

CTCCTGCAACCCTCAGTTGCGACCTCACGACACACCGTCTGGCTTACCCCTGCCCCCACCCCAG GTGGCC 

CTGCTGCCGGACGCGCTGCGCGGCAGCGGTGTGTCGCCCACCCCGGACGCGCTGTACGACGGCGCCCTGG 

CGCACCTGGTGGACGGCTTCCGCAACGGCTGGCCGCACTCCGTGCACCAG GTGGGGGAGCGCGGTGCCTG 

GATGTCTGGATGGTCACTGGCCGCAAGGCTGTGCGCACCATCGGGTAGAGTGTAACCAAATGATGTGCGC 

GCAATGAAGGGTGAGCAGATTCCAGCCTCCCTCTGTCGGCTGGCGTCCAACTGTGCCAACTGCGCACACA 

CCTGCGCACGCCCCAG GCCGACCAGCTGCTGGCCAAGCTGGAGGCGCAGCAGGCCCGCGCAGCCGCCATG 

CGCCGCGAGCAGTCCGAGCTGGCCGCCGCCGCCGCAGCCCGCCGTGCCATGTACAGCGGTCCCGCCGCCG 

CCCACGGTCCCACCCTGTACACCAACTACAACAACCCTGCCGGCAGCGGCAATggcgcgccgccgccgcc 

gccccgccccATGCCCATGGTGCCCAGGGGCGACGGCCAGCACGCCATGGCGGCGTCTGTGGCGGCGCAT

GTGCACTCCACGGCGATGGCGGAGCAcgcggcgcgcagcgcggctggcggcgccgccggcgccTCCGATG 

GCGGCGCGCACGCCAACGGCGTGGCTCTAGAGCGGGCCGTGTGCGCCGTCCTGCTGGGTGACTACACCGC 

GGCGGTGGAGCGGCTGGGGCTAGACACGAACGCGGCGGTGGAGCAGGAGCAGCTGCGCGAGTTCGTCCTG 

GTGCGCCGGGGAGGGCCTACTGCAAAACGTGTTGCTCAGGGTCTTGAGATACCGAACACAATGTTTTCGT 

ATACATCTCCCGTCGAGAGAGCTATGCCTCCACCGTCGGCCCGGCTCCACTGCACCCGATGCGGTTGCAG 

GCCCACTCGCCCAACGGCCGCGGCGACCTGCGCCCGGGCCTGAGGGCGCTGGCCACCCGCTGGCTGGAGG 

GCGTGGCGCTGGCGTCCTTCCGCGACACTGCCGGCAGCCCCGTGCCGCCGCTGGAGGCCAGCTGGTTCGC 

GGACCTGCGTGTCGCCTTCTATCTGCAG GTGAGGGGCGGCAGAAGAGAGGGGGGAAAGGGAGGCGAGAAG 

GCGCTTCCGCCGCTGGCGCAACGGGCCATCCTGGTGGAGCACGGCGCTACATCGCATCTGGTCCACCGTC 

TCTGGATGTATAATTCGTGCACTCTTAACCGGCCGCGCAG GTATGGCGGCTGTGCCGCGTGGAGCAGGTG 

CTGGCCGCCGCCCACTTCCTGGCCAACCTGCTGCCCAACATGCTCAAGgccatcgccggcactgccgtca 

aggtcgcagccaacaccgccgtggcagcctcccgcgcgcagcgcctcagcgccaccgtcgcggccagcac 

cgccaccgcctcgtcatcttcctctgccgcccgcggcgctcgtgccggtgccctgagcgctgccaccgcc 

gccgcacacgccgcgcgccgccAGCAGGCGAACGCGGTCGGTGCCAGCATCGTCGGTGCTGACGTGCTGC 

CCCCCACAGCAGTGgccgcggctgccgcggctggcacagcggccgccgccgcagtcaccggcccGgccct 

cggccgtggcgctgcagcttccgcctcttcctttgaggagggcgccgctgaggccgctgacctgcgtcgt 

cgctttgtcgccaccagccgcggcgccagcgcggccgtcgGTGCGCCCACAGCACCAGCCGCTATGACTG 

GGCCCCAGCACGGCGCCGCCTCTGCTGCGCAGTCGCACCGGGAGGAGGATGAGGATTCGCACGGCGGCCA
GGAGGGGGGCGTGCCGCGGCGCATGAGCGAGGCGGACCTGCGTGCGCACCTGGCGGGCCTGGAGAAGGCC
ATGTGGGACTCGGAGCTGCCGCCGCCGCCGCCAtCCCGCGCGCAGAAGGCGCTCACCTACGCCGCAGGAC 
TGGTGAGTTGCTGCGCAGCCTGACGGCCATAGTTGCCGTAGTGCCATAGTGACCGAGCACCGTGATGTTT 
AGGACATGGGCGGAGAAGTGTTAGGACATGAATTGCATCAACGCTGCAAATCTGGTGTATGGTACGCGCG 
TTCCCTGTCACCAACAAGGCTGTTGACCAAGCTGCTGCTGCCCTTGCACTCTTTCAACGCCCGTCTGCAG 
CTGGCCGTGGTGGTGGCCTTCCTGGTGTCCAGCTTCTTccgccgcaacgacggcgccgcctccgccctgg 
cacccgccgccgtcaccaccgcctccgtggccgTTAGCGCGCAGCCCGCCAAGCCGGGCAAGGCCACCCG 
CTCCGCGCACTGA 

Transcript Sequence [46927:50859] (without introns)

>genie.294.6|Transcript

ATGAACTCGGCGGAGCACGTCTCTGTTGCCGTGGACTATTACCGAATGCTGCACGTTCCCCGCGTAAGCC 
GCCCTGACGCCATTCGCAAGGCGTATGAGAACCTGGTGAAGCAACCCCCCGCTGCCGCGTACTCTGCGGA 
CACCCTCTTCGCACGCGCGGTGCTACTCAAGGCAGCCGCGGAGTCGCTGACCGACCCGGACCTGCGCCGC 
TCATATGACGCCAAGCTGGCCGCTGGTCACACAGCCCTGCGCGTCAGCCAGCAGGACCTACCCGGAGCCC 
TTGTCGTGCTGCAGGAGATCGGCGAGCACCAGTTGGTTCTGGATCTGGGTCTGCGCTGGCTAGAGGTAAA 
CGGCGGCCAGCCCGACGCCGGCGACGTGGCCGCTGCCGTGGCCCTGGCCTACTGTGACCGCGCTGGTGAG 
CGCCTCACCTCCCAGCTGCAGCCGCCGCCGGCCTCAGCGCTGCCAGGCCCCGATGGCGCGGCGGTGCCGC 
ACGCGCACGTGGGCGCGGTGCTGCCCGCATGCGACGACCTGGACGCAGCGCTGAGCAAGCTCCGGCGGTA 
CGGCATGGCGCAGCAGCTGCAGCAGCAGATCGTGGGCGCGCTGCGGGACCTGGCGCCAGAGTACGCGTGC 
GAGCTGGCCGCCCTGCCGCTGGGCGCCGAGACCGCCGCCCGGCGCGCCAAGGGCGTGGCGCTCATGCGCG 
GTGTGCTGCGCGCCGCCGCCACCGTGGCCGCCGCCACAGCCAAGCCCGAGGCTGCTGCTGACGACAGCGA 
CGACGACGAGGTGGACCCGCGCAGTGTGCTGGCGGCCGCCCGCCGCATGCTGAGCCGCAGCCGCGACGTG 
CTCACCTGCAGCGAGCAGGTGGCCCTGCTGCCGGACGCGCTGCGCGGCAGCGGTGTGTCGCCCACCCCGG 
ACGCGCTGTACGACGGCGCCCTGGCGCACCTGGTGGACGGCTTCCGCAACGGCTGGCCGCACTCCGTGCA 
CCAGGCCGACCAGCTGCTGGCCAAGCTGGAGGCGCAGCAGGCCCGCGCAGCCGCCATGCGCCGCGAGCAG
TCCGAGCTGGCCGCCGCCGCCGCAGCCCGCCGTGCCATGTACAGCGGTCCCGCCGCCGCCCACGGTCCCA 
CCCTGTACACCAACTACAACAACCCTGCCGGCAGCGGCAATggcgcgccgccgccgccgccccgccccAT 
GCCCATGGTGCCCAGGGGCGACGGCCAGCACGCCATGGCGGCGTCTGTGGCGGCGCATGTGCACTCCACG 
GCGATGGCGGAGCAcgcggcgcgcagcgcggctggcggcgccgccggcgccTCCGATGGCGGCGCGCACG 
CCAACGGCGTGGCTCTAGAGCGGGCCGTGTGCGCCGTCCTGCTGGGTGACTACACCGCGGCGGTGGAGCG 
GCTGGGGCTAGACACGAACGCGGCGGTGGAGCAGGAGCAGCTGCGCGAGTTCGTCCTGGCCCACTCGCCC 
AACGGCCGCGGCGACCTGCGCCCGGGCCTGAGGGCGCTGGCCACCCGCTGGCTGGAGGGCGTGGCGCTGG 
CGTCCTTCCGCGACACTGCCGGCAGCCCCGTGCCGCCGCTGGAGGCCAGCTGGTTCGCGGACCTGCGTGT 
CGCCTTCTATCTGCAGGTATGGCGGCTGTGCCGCGTGGAGCAGGTGCTGGCCGCCGCCCACTTCCTGGCC 
AACCTGCTGCCCAACATGCTCAAGgccatcgccggcactgccgtcaaggtcgcagccaacaccgccgtgg
cagcctcccgcgcgcagcgcctcagcgccaccgtcgcggccagcaccgccaccgcctcgtcatcttcctc 
tgccgcccgcggcgctcgtgccggtgccctgagcgctgccaccgccgccgcacacgccgcgcgccgccAG 
CAGGCGAACGCGGTCGGTGGCAGCATCGTCGGTGCTGACGTGCTGCCCCCCACAGCAGTGgccgcggctg 
ccgcggctggcacagcggccgccgccgcagtcaccggccccgccctcggccgtggcgctgcagcttccgc 
ctcttcctttgaggagggcgccgctgaggccgctgacctgcgtcgtcgctttgtcgccaccagccgcggc 
gccagcgcggccgtcgGTGCGCCCACAGCACCAGCCGCTATGACTGGGCCCCAGCACGGCGCCGCCTCTG 
CTGCGCAGTCGCACCGGGAGGAGGATGAGGATTCGCACGGCGGCCAGGAGGGGGGCGTGCCGCGGCGCAT 
GAGCGAGGCGGACCTGCGTGCGCACCTGGCGGGCCTGGAGAAGGCCATGTGGGACTCGGAGCTGCCGCCG
CCGCCGCCATCCCGCGCGCAGAAGGCGCTCACCTACGCCGCAGGACTGCTGGCCGTGGTGGTGGCCTTCC 
TGGTGTCCAGCTTCTTccgccgcaacgacggcgccgcctccgccctggcacccgccgccgtcaccaccgc 
ctCcgtggccgTTAGCGCGCAGCCCGCCAAGCCGGGCAAGGCCACCCGCTCCGCGCACTGA 
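
The transcript above is the genomic sequence with its introns removed. A minimal Python sketch of that operation follows. The exon coordinates of this gene model are not given in the text (the underlining that marked them is not reproduced), so the example uses a made-up toy sequence and made-up coordinates; `splice_exons` is a hypothetical helper name.

```python
def splice_exons(genomic, exons):
    """Concatenate exons given as 1-based, inclusive (start, end) pairs."""
    ordered = sorted(exons)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("exons overlap")
    return "".join(genomic[start - 1:end] for start, end in ordered)


# Toy example only: this sequence and these coordinates are invented for
# illustration and are NOT the exon boundaries of the gene model above.
toy_genomic = "ATGGCCGTAAGTTTCAGGATTGA"
toy_exons = [(1, 6), (18, 23)]  # the intron at 7..17 starts with GT and ends with AG
print(splice_exons(toy_genomic, toy_exons))
```

Splicing the real genomic sequence with the true exon coordinates and comparing the result against the listed transcript would be a useful cross-check of both listings.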

Protein Sequence 

>genie.294.6 

MNSAEHVSVAVDYYRMLHVPRVSRPDAIRKAYENLVKQPPAAAYSADTLFARAVLLKAAAESLTDPDLRR 
SYDAKLAAGHTALRVSQQDLPGALVVLQEIGEHQLVLDLGLRWLEVNGGQPDAGDVAAAVALAYCDRAGE
RLTSQLQPPPASALPGPDGAAVPHAHVGAVLPACDDLDAALSKLRRYGMAQQLQQQIVGALRDLAPEYAG 
ELAALPLGAETAARRAKGVALMRGVLRAAATVAAATAKPEAAADDSDDDEVDPRSVLAAARRMLTRSRDV 
LTCSEQVALLPDALRGSGVSPTPDALYDGALAHLVDGFRNGWPHSVHQADQLLAKLEAQQARAAAMRREQ
SELAAAAAARRAMYSGPAAAHGPTLYTNYNNPAGSGNGAPPPPPRPMPMVPRGDGQHAMAASVAAHVHST
AMAEHAARSAAGGAAGASDGGAHANGVALERAVCAVLLGDYTAAVERLGLDTNAAVEQEQLREFVLAHSP
NGRGDLRPGLRALATRWLEGVALASFRDTAGSPVPPLEASWFADLRVAFYLQVWRLCRVEQVLAAAHFLA
NLLPNMLKAIAGTAVKVAANTAVAASRAQRLSATVAASTATASSSSSAARGARAGALSAATAAAHAARRQ
QANAVGASIVGADVLPPTAVAAAAAAGTAAAAAVTGPALGRGAAASASSFEEGAAEAADLRRRFVATSRG
ASAAVGAPTAPAAMTGPQHGAASAAQSHREEDEDSHGGQEGGVPRRMSEADLRAHLAGLEKAMWDSELPP
PPPSRAQKALTYAAGLLAVVVAFLVSSFFRRNDGAASALAPAAVTTASVAVSAQPAKPGKATRSAH*

Thermosynechococcus elongatus BP-1 tlr0758
Location: 

Init: 782410 Term: 784431 Length (aa) : 673 
Direction: direct 

Gene Products: cell division protein Ftn2 homolog 
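
The Init, Term and Length values in this entry are mutually consistent, which can be confirmed with simple arithmetic. The Python sketch below assumes 1-based inclusive coordinates and a terminal stop codon that is not counted as an amino acid. `orf_length` is a hypothetical helper name.

```python
def orf_length(init, term):
    """Return (nucleotides, amino acids) for 1-based inclusive coordinates.

    Works for either strand because only the span matters. The final codon
    is assumed to be a stop codon and is not counted as an amino acid.
    """
    nucleotides = abs(term - init) + 1
    if nucleotides % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return nucleotides, nucleotides // 3 - 1


# Init and Term as listed for tlr0758.
print(orf_length(782410, 784431))
```

Under these assumptions the span is 2022 nucleotides, that is 674 codons, or 673 amino acids plus the stop codon, in agreement with the listed length.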
DNA sequence : 

>Thermo (Chr) 782410-784431 

GTGCGCATTCCCCTCGATTATTACCAAGTGTTGGGTGTGCCTATTCAGGCAACGCCGGAG 
CAAATTGAGCAGGCCTTTCGGGACCGGCTGTTGCAGCTCCCTACCCATCAGCACTCCCCC 
ACCACAGTTGCCACCCGTCGCGAACTCATTGAGCAGGCCTATGCAGTTTTGCGAGAACCG 
GAGCAGCGCGATGCCTACGATCGCCACTGCCGTACCGTTGATCCCGATGATTTGATTGCC 
CAGTTGGATCCCGATGCCACCACTCCCCACATTGAAATTAGTGATGAGCAATTGTCGGGG 
GCACTCCTACTGCTGTATGAACTAGGAAATTATGCCCAAGTTGTCAACCTGGGAGACGCC 
TTTCTTAAAAAGGATGTTTTTGAGCGCAATCGCCCCTACACTTCCCCTGCCGCCGTTGCC 
GACATTACCCTCACTGTGGCTTTGGCCTATCTGGAATTGGGACGGGAGGAATGGCAGCGG 
CAGTCCTATGAATCAGCCGCCTCTCAGCTAGAAGCCGGTCTCCAGGTACTTCAGCGGGTA 
AATTTGTTTCCCGAGCTCCAGGAGCAGTTTCAGACGGAACTGAATCGGCTGCGTCCCTAC 
CGCATTCTGGAATTACTGGCACTGCCTTTGTCCGATAGTGGGAATCGGCAGCGGGGTATT 
TTATTGCTGCGGCAAATGCTGAGTGAGCGCGGGGGCATTGAGGGGCGCGGTGACGATCGC 
TCAGGACTAACAGTTGAGGATTTTCTGAAATTTATTTTGCAACTGCGCAGCCATCTTACC 
GTGGCAGAACAACAGGAACTCTTTGAACGGGAATCGCGGCGTCCCTCAGCGGTGGCCACC 
TACCTTGCGGTACATGCCTTGGTAGCACGGGGAGTGCATGAACTGCAGCCGAGCTATATT 
TGTCGGGCCAAGGATTTATTGCAGCAGCTGCTCCCCCATCAAGACGTCTATCTTGAACTT 
GCCAGTTGCTTGCTGCTTTTGGGACAGCCCACCGAGGCCTTGGCAGCTCTTGACCACAGC 
CAAGATCAACCGACTCTGGACTTTATCCGCCGTCATGCCGGTGAGGCTGGCGATCGACTG 
CCGGGGCTTTATTACTACACCACACAATGGCTCACGGAGGAAATTTATCCTGCATTTCGG 
GACTTGGGGGAAACACCCGTGGCCTTGGAGGCTTACTTTGCTGATGCCAATGTCCAAACC 
TATCTAGAGGCTCTCAGTGAGGACTCCATTGCCCCTGAACCCCCTGCGACCACTGCCTCT 
GCGCTCCCTGAAGTGATCAGACCAACGGTGGCCGTGCCCCCTCCCCTCTCCTTGACAGCG 
GAAACGTTACCGTTGCAGGATCAGAGTCGGCTGGGTCAGGGCCTTTCGGCATCGGCTTTT 
ACCCCTTCTGCAACTGCAACGGGGACATCGATGCCCCAACCATCGCCTCGCAAACGGCGC 
AGCCCTCGAAACCGTTGCGCCCAAAAACGTCAGACTTGGTTTTGGATGGGTGCAGGAGTG 
GTTCTTGTGGGTTTAGGGGCGTTGGCAAAAGTCTATTGGCCCGCCAAAACCGCTGAAGCC 
CCCCCGCCGCCGGTGACACCGGCACCAACTCCTGTGGCAACGCCGACCCCAACGCCACAA 
CCGACGACCTTAGCCATCACTTTAACACCAGAGATGGCGCGCGATCGCCTCCACACTTGG 
CAGCAAATTAAAGCCCAAGCCCTTGGGCGACCATTTGAGGTGGACAAACTAACAACGATT 
TTGGCGGAGCCAGAACTCAGCCGCTGGCGATCGCGGGCACAGGGCTTAAAGTCCGAGGGC 
AGCTATTGGGTTTATACCCTAAAGAACTTAGAAGTGAAGGAAGTCCGCCTCCAAAGGAGC 
GATCGTGTGGAGGTGTTGGCAGAAGTCAACGAGGATGCCCGTTTCTATGAACAGGGAACC 
CTGCGCACTGATATTTCCTATAGCGATCCCTACCGGGTCATTTATACCTTTATCCGTCGC 
GGCAATCAATGGTTGATTCAAGGCATGCAGGTGGTTAGTTAA 

Protein sequence : 

>tlr0758 {782410 - 784431 direct} cell division protein Ftn2 homolog 

MRIPLDYYQVLGVPIQATPEQIEQAFRDRLLQLPTHQHSPTTVATRRELIEQAYAVLREPEQRDAYDRHCRTVDP 

DDLIAQLDPDATTPHIEISDEQLSGALLLLYELGNYAQVVNLGDAFLKKDVFERNRPYTSPAAVADITLTVALAY 

LELGREEWQRQSYESAASQLEAGLQVLQRVNLFPELQEQFQTELNRLRPYRILELLALPLSDSANRQRGILLLRQ

MLSERGGIEGRGDDRSGLTVEDFLKFILQLRSHLTVAEQQELFERESRRPSAVATYLAVHALVARGVHELQPSYI 

CRAKDLLQQLLPHQDVYLELASCLLLLGQPTEALAALDHSQDQPTLDFIRRHAGEAGDRLPGLYYYTTQWLTEEI 

YPAFRDLGETPVALEAYFADANVQTYLEALSEDSIAPEPPATTASALPEVIRPTVAVPPPLSFTAETLPLQDQSR 

LGQGLSASAFTPSATATGTSMPQPSPRKRRSPRNRCAQKRQTWFWMGAGVVLVGLGALAKVYWPAKTAEAPPPPV

TPAPTPVATPTPTPQPTTLAITLTPEMARDRLHTWQQIKAQALGRPFEVDKLTTILAEPELSRWRSRAQGLKSEG 

SYWVYTLKNLEVKEVRLQRSDRVEVLAEVNEDARFYEQGTLRTDISYSDPYRVIYTFIRRGNQWLIQGMQVVS
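The DNA listing above starts with GTG, whereas the protein listing starts with M. A minimal sketch of the translation follows, assuming the standard genetic code and assuming that an initiating GTG (or TTG) is read as methionine; the helper names are illustrative only and do not come from the source.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: these codons are treated as initiators and read as Met.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate(dna: str, start_as_met: bool = True) -> str:
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    residues = [CODON_TABLE[c] for c in codons]
    if start_as_met and codons and codons[0] in START_CODONS:
        residues[0] = "M"
    return "".join(residues)


if __name__ == "__main__":
    # First 30 nt of the tlr0758 DNA listing above.
    first_codons = "GTGCGCATTCCCCTCGATTATTACCAAGTG"
    print(translate(first_codons))
    print(translate(first_codons, start_as_met=False))
```

Under these assumptions the first ten codons give the same residues as the start of the protein listing, and only the first residue changes when the start-codon rule is switched off.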
Trichodesmium erythraeum

Contig97 Gene 8639 
Strand = r 

Start Location: 40312 
Stop Location = 37943 
Stop Codon = TAA 
MRNA 

GTGCGGATTCCATTAGATTATTATCGAATTTTAGGTTTACCAATTCAGGCTACTGCTGAACAGTTGCGGCAGGCA 
CATCA

AGACCGCACTCAGCAGTTTCCTAGAAGGGAGTATTCTGAAGCCACAATAGTTGCTCGTAAACAGCTTATAGATGA 
GGCTT 

ATGCTGTTCTTTGCGATCCTGAACAACGTCAAACCTATGATGGTAACTTTTTAGCTAAAACCTACGAGCCAATAG 
TAGAA 

GAACTCAATCCAAGTTCTCAGATAAATTTTGATCAAGCACAAGAAAAAGAAACCACACTTAAGGAGACTAGAGAA 
GTTCT 

TCCGGAAATAGCTTCTAAACAGTTAAAAAAAAGGACAAGTTATCAAAACAGAGAGACTAAAGCTGCCTCTGATTT 
TCATT 

CTAATACCCCTAGTATAGAAATAGAATATCCACAATTTGTGGGAGCCATCCTAATTTTACATGAGCTAGGAGAAT 
ATGAG 

CTAGTATTAAAAATAACTCACCCTTATCTTCTTAACAATAGTATAACTATTAAAGATGGACGTTTTGGAGACCCA 
GCATT 

AGTTTTGCCAGATGTTGTCCTTACAGTTGCTCTAGCAAATTTAGAATTGGGCAGAGAGGAATGGCAACAAGGACA 
ATACG 

AAAGTGCAGCTACAGCTTTAGAGGCTGGCCTAGGGTTATTGCTACGAGAAAACCTATTTGTCCAAATACGAGGAG 
AGATA 

CAAGCTGACCTTTATAAGCTACGTCCTTATAGAATAATGGAGCTAATAGCACTACCAGAGGAAATAGCTCTAGAC 
CGTAG 

CCGTGGACTAGAAATTCTTCAAGATATGCTCAATGAACGGGGAGGAATTGATGGTCAAGGTGAAGATAGCTCTGG 
ACTTG 

GGATAGAAGATTTTCTAAAGTTTGTTCAGCAGCTACGTCAATACTTAACTACAGCAGAGCAAAAGAAGTTATTTG 
AGGCA 

GAAGCCCTTCGCCCTTCCGCAGTTGGTGCATATCTAGCGGTTTATACTTTTTTAGCTCAAGGGTTTGCTCAAAAA 
CAACC 

AGCCTTTATTCGTAAAGCTAAGTTGATGTTAATGCAATTGGGTCGGAGTCAAGATGTAAATTTAGAGAAATCTGT 
CTGTG 

CTTTACTTTTAGGGCAAACTGAAGAAGCTAGTCGTTCATTAGAACTTAGCCATGAAAATGAACCTCTATCCTTTA 
TTAAA 

GAAAATTCTCAACAATCTCCAGATTTATTGCCAGGTCTATGTCTCTATGCTGAACATTGGTTGACAGAGGAGGTT 
TTTCC 

ACATTTCCGTGATTTGTCTGACAAGTCAGCTTCTTTGAAAGATTATTTTGCAGATCAACATGTTCAAGCTTATCT 
AGAAG 

CTTTACCTACAGAAGCAGAGGTAGCTAATCAATGGGTAGTCGTTCAGCCTCGTCGTAGTAATCACAATAAAAAAC 
AAATG 

TTCGACCCCAAGGAACTTGAGAAGTTGAATGTATCAGATTTGGAGGATAAAGATATTTCTCGGGTAGATGCTACT 
GCTAC 

TGGTATTGTTGCTTCTGGAAGTCAAGGAAGTTCTAATTTACTAGGGGCTAGTTCTGATGGGTTGCTTCAAGAATT 
AGAAA

AATCATCATCTACTAGAGGTGGGCCAAAACAAGTAACTACTAAGAGTTCTAGTCACTATTTAGGAAAAATTAGGG 
AAAAG 

AGTATAAGTGGTTTACCTGAGTTTAATGAAAGTACATCTATTGAGAGTGGGGGGTTACCCCAATCTATCCAAGAG 
CATAG 

TTCACGTAGAACTTCTGCTAGAAGAGAACCTGTTAAGTTTGGTCGTTTAATATTAATCGCAATTGTGGGATTTTT 
GTTAA 

TAGGATTTATTGGGTTGTTAACAATTAAAACTATCGGCTGGTTAGTAAATGCTTTAGGATGGGAAAGAGAAAAAC 
TGATG 
ATACAATTGGATAGGCCTCCTATAGAAATCCCAGAACCTGATCGGGTTAACCTCGCAGCATCAGGACCGATAACA
AAAGA 

AGTAGCAAGGCGAACAATTCAAAGTTGGTTAGATATCAAGGCTTCTGCTCTTGGTCCTAATCATAAAATTGAACA 
ATTAC 

CAAATATTTTAGTAGAACCGGCACTTTCTCGTTGGTTACCTACAGCTAATGCCCTGAAGCAAGAAAAGTCATACC 
GTAGG 

TATGAGCATGATTTAGAAATAAGTAATATAAAGATGAGTAATACAAATTCTAATCTCGCTCAAGTAGATGCTAAA 
GTGAT 

AGAAAAGGTAGAGTTTTATTCTGACAATGGTAGATTAACTAATACTAACAATGAAAACTTATTTGTTCGTTATGA 
TTTAG 

TTCGTAAAAGTCAAAAATGGCAAATTAGTAATTGGAAGGTATTGAGATAA 



PROTEIN 

Protein Length = 789 

VRIPLDYYRILGLPIQATAEQLRQAHQDRTQQFPRREYSEATIVARKQLIDEAYAVLCDPEQRQTYDGNFLAKTY 
EPIVE 

ELNPSSQINFDQAQEKETTLKETREVLPEIASKQLKKRTSYQNRETKAASDFHSNTPSIEIEYPQFVGAILILHE 
LGEYE 

LVLKITHPYLLNNSITIKDGRFGDPALVLPDVVLTVALANLELGREEWQQGQYESAATALEAGLGLLLRENLFVQ
IRGEI 

QADLYKLRPYRIMELIALPEEIALDRSRGLEILQDMLNERGGIDGQGEDSSGLGIEDFLKFVQQLRQYLTTAEQK 
KLFEA 

EALRPSAVGAYLAVYTFLAQGFAQKQPAFIRKAKLMLMQLGRSQDVNLEKSVCALLLGQTEEASRSLELSHENEP 
LSFIK 

ENSQQSPDLLPGLCLYAEHWLTEEVFPHFRDLSDKSASLKDYFADQHVQAYLEALPTEAEVANQWVVVQPRRSNH
NKKQM 

FDPKELEKLNVSDLEDKDISRVDATATGIVASGSQGSSNLLGASSDGLLQELEKSSSTRGGPKQVTTKSSSHYLG 
KIREK 

SISGLPEFNESTSIESGGLPQSIQEHSSRRTSARREPVKFGRLILIAIVGFLLIGFIGLLTIKTIGWLVNALGWE 
REKLM 

IQLDRPPIEIPEPDRVNLAASGPITKEVARRTIQSWLDIKASALGPNHKIEQLPNILVEPALSRWLPTANALKQE 
KSYRR 

YEHDLEISNIKMSNTNSNLAQVDAKVIEKVEFYSDNGRLTNTNNENLFVRYDLVRKSQKWQISNWKVLR 
SEQ ID NO:11

56041 actgtaaatt ttgataaata aaaaaaaaca aaaaaaagat cgccaaatca tatttcatac 
56101 tatcagattt aaacaatata atttgttcga cgatacagaa atattttacc tcacaggaag 
56161 aggttgcgca gaaggagcca tggatgtgtt tgttcgagtc gagttgcttt gttgtaagta 
56221 ggtaattgca agaaacttga gttgtctata aagctttgga atacttctct ttatatatac 
56281 gtttacaaca attttttttt tttttttttt tctattttta caacaaattg ttttttatta 
56341 taataataaa cttaaacgaa aataaataat atctctttgt tctatttctt aaaaaagaaa 
56401 ttagcttgta gtacttcaac gtatcttaac tctttagtct ttagtaggta tatatcatct 
56461 atttatttat ttttattttt tttatattac gattatagtg tacgtacgta tttattaatc 
56521 aaaaataact tggtagaagt aaaaagaaaa tgattttttt tttactcagt gatcagtttt 
56581 acgtttattc aaaaataagt tgtagtttcc ttcttaatat tcaagttata tgactaaaaa 
56641 ttggtcggtt aatttactat taagattaat cggaaactct agttagatca cgagataatc 
56701 atcacgtgga gaaacatttg gttcttgtca cgtggagaaa acgttaagct tattttttac 
56761 ttctttatta tatttttgag gaaatggttg aaagaaagag agtgtttaaa atgtgaatgc 
56821 gctcgtagtt aggtggaggt taatgggtag gagggtaggt catatgtgta ttagtgatgg 
56881 ataaaaatta aaaacataaa aaaaacttca agctgtaaat aatctaataa aagaacatag 
56941 aaatataatc aaagaaccat ttaactaaat aaatactttc gattcaaata gcatatttct 
57001 aagttccaag aatagctatc ctctatccac atgttacatt ttttttttct ttttcacatc 
57061 catatagttt ttaaaataat tttctagatg gtatttttta ttcgacattt ttttttcctt 
57121 ttagatttac tgattataat ttatttagaa ataaatgata cgactgtcgt ttctacaaaa 
57181 ctgaaatttg caaacattgg accaaaaagc gaaaccttaa tcacttgaaa cgacaacgtt 
57241 ctttagtatg tttttggaca tacaaagtac acataagatg ttccctcact cttcgattgt 
57301 ttcttaacct aatataatta agcaatattg aacttgagtc actcaatgct gcaccgaagg 
57361 agcctttaga ttttgagcaa attcatgaga gtttagcttc tcattcatca ctctgaattt 
57421 ctcttttatc ctctttatct gtccaaaaca tgacacataa cataatgtta gttctcctgc 
57481 atacttccaa tggcaaatag aaaaaagaga cattgatcat agaagtcagt ttggtttacc 
57541 cttctgagct cgatctctgt gctccgtttc ttttgatcaa gtgattgccg gagattcgtg 
57601 atgtcgaaga tactatcgag gtcgtcttca aatgcgtttt ccaactcttc ccggagaaga 
57661 gcaggtaact tatcaacgat gggcattaga agaaaacagt tgaactgcag aacaaaagaa 
57721 aacacagata caaacttttt aaaagaaaag tcattttaaa agcaagaaga atctgagtaa 
57781 aaactgaagt aggagcaaac ctttaactca gcagaggcga gaaagtactc tcgtatgccc 
57841 tggaatatct gttggaccaa tgcgtacaca attctctcag aggaaggagc aagcttgcgg 
57901 ttccaaagtg tgctatctag aagatcagcc aaccgcattt ctgttgtctg aatactggaa 
57961 cctgaatcga tgtttgaggc gagatggctt agctttacat ctgatcttga cttggtgtct 
58021 gttgtgccac ctaatgcatc ttggggaaga ctaaatccta tggcattacc tgatgtcgta 
58081 ttatgctctg ttccaccaaa tgagtccaag aattgacgta gaccagctcg gttctacata 
58141 acattgagaa acgaaaacta ctcaatcaga aacggatact tgatggtatg tacacaactc 
58201 aattggattg aaacagagct atagggctgt agcaatgacc ttgttgtgaa gagaccatgt 
58261 aacatagcga gttgtacttg ctaaatcctc catacatctg caaacaatat aaaatccaaa 
58321 gggtgatcaa tcactaaagc tcactagaac acaggtagga ggcaccgaca tggtaagaac 
58381 aggaattgga aatagaatta cttgtcacga catgattttt ctgtggactc cacaaaactg 
58441 ttgaatgctg aagcaacccg cttgagaaac acctcatgcc cacttaaata ttcaccttct 
58501 ttctattcaa atttagaaca tacatcaaaa aatttgctgg aaagggatca tgagtatgat 
58561 accgtcaaac caaagaaaac agtacctacc tgaagaagat atacagaaat tggaagcaat
58621 ctcttgagaa tgtgtagaag cctcgcccct aactatatca acgcaaaaca aacgaaaatg 
58681 agaactggaa aaaactttct gtatggaaag agaaacatgt gaataacaaa atttcagatg 
58741 aaagtattcc caaacatagt ttctgtaagc agaacatgtt tactcgataa ctcttatgca 
58801 caaataagtt ccagcaaatc tcaaaactga atggtagtat gatttcaata tataacgtta 
58861 tatttcattt ttttttttac gtacagtaca ccttaactaa ttagtaaaat tgctttccat 
58921 cctccacgaa agaaaaagaa aaaagtagct atatctatgt cacctgatga aggaaaggtt 
58981 caaacgtctc acgagccttc gcaactgcta taacacaagc tgttctacaa cagcaaataa
59041 gagaaagaga ataagaggcc atagaaaaca tgacaaacgt tgcagctcag attagatact
59101 gaaaggggtc tgggatgcaa agacaataaa ttgagaagtg tgttgcatgt cagtcaatcc
59161 tatgatacct ggaatagttt gttccatcat gaatatcctc aactccacat gcatttacaa 
59221 tttcctccct cgttattggg ggacatttga tagcaccaac tagaaaacga aactcagcca 
59281 tggcacggtg atattgtgca cccccataga gacgcatccc tgcattctgt aaaatgaaag 
59341 ataatctggt tatggtctct cataattctt gaaggtccaa cgaagtatct cttttatttg 
59401 tttccaatac attattcttt ggcacatatg tttcatgcgg tcaaatttat cttccatcat 
59461 attataatcc atgtacaaga acaagacaac tggatttgaa gaccatgccc agcttgctct 
59521 ataaagtcca acaatattct gcttcaggga aagacttacc ggtattagct tatgtgaaaa 
59581 ctggagacca tcagtaccaa caaatgctcc tccttgtgtc ctttcatctt gcagtgtctc 
59641 acctgaaaaa caccatgaga aattattaac aatcaaagaa cccaacataa agagaatgct 
59701 gttataaaat gtgcttctgc cagtaaccaa agtatcatga ccaatgattg attgattagc 
59761 atacatcatt ccatgtgtaa tcatcgcagt ctggtgaccc agtcgaattg aacaatatgc 
59821 atttaactaa actgattttg caaaagtcca atttaacaac acccagaaac aagaaaagtt 
59881 tatgccaaag aagttgacta gcagagaaca gagcagtaac attaccaaat ttatctggag 
59941 gggccacaac tgttcccttc aataacagcg ataactgatc aagaaaaata taaacaaaac
60001 aggtgagaaa acacagcact gatcaatact aacaaaggta cttcgtacgt caatcagaaa 
60061 atatgacgca gcaattttaa agtcttaagg gcatccaaca caaaaagttt acagccattc 
60121 tgaatttgta gcaagtccta gatatcattt actgtagcat aattttatat gtgtcagtaa 
60181 tcaataaaca aatttgtttt tatgtgtcag tagttaataa accaaaaaaa aagagaagtt 
60241 tacacaaatg aacttgttgt aattatacaa aaactattaa tccacgagtc caggcaaaaa 
60301 tgaaaaggta tgggaaggtg taaatagaaa tctaaaaaaa cgaaatgctc tctacagtta
60361 ccttggttaa gaagagatca tggaaagtcc tgcctctctc tttgagtttt gcttcatcca 
60421 aagagctgca ttgaaaggaa ttattcaacc tccaatgagt tatattttct ataaatcagt 
60481 agctaacaat taaactgcct aaaatcaagt agacattttc agacaaaaca aattgcgacc 
60541 taagttcctt gctcacggta tccagctttc tgactgtact gcggtactcc tttcctaaca
60601 gtggaatgat caatggaaca ctctctttgt acctggaaag agaagggcat caagactaca 
60661 gcgaaaagta aactacaata gaaacagagg ctggaaaaat cagagttaaa acaacagtta 
60721 taccttttcc agagtagttc ttccagaaac aacctcagtt tactgatgcc aatcctactc 
60781 ttttcctgtt ttgtcagtaa acggcccaac ttcttctcta aagatgcaat gtcttccatt 
60841 tctctaagtg acacagcctg taataaaaac cacacatagt ttagaaaaag acctgtttaa 
60901 cttgtttaag gaatcagaca gcagagcaga gacctgtttg aactcgtcat tagacttata 
60961 cactgaatcc tgtccatagc caactcttcc agaaggcaca gacgtgaaaa aaggagaatc
61021 gcccaataag gagctgtcaa gtgcgcttgc aggaggtgag agaaagactt ccacgtcaga 
61081 tgaacatgag aattgaggga ttttagtgtc aagctttgta gaaacaacaa ttgtcctaga 
61141 aagctcagga tcaacctaca tgaacgagaa acaaacttta acaaaaataa agacaaggtt 
61201 agacgcaatg gagttacgtc aagcaacgta cttgcatcac tatccttcga gtggttgcaa 
61261 tgctccagtc actgctatct tcgaggcata aaatgatgaa ctctttgtgt tgcatctttg 
61321 ctcggactag agcttccaca gcccgtgctt gaacctaaga aaaagaacaa gtaacccact 
61381 ctcaaataaa gcaaaaccaa aacatgaaat cagccacgga attggctgga agccataaga 
61441 aaaaacaacc tgaagagctc ggtttttcag tcctggtgca ggagcaataa gtccaggtgt
61501 atcaatgatg gtaaggtttg gacaatactt atactggact ttcacaataa tctcctttgc 
61561 agagaatggg ctacatggct cttgctccag cctcatgttc tcagcctcaa tatatgccta 
61621 actccaaatc atataacaaa tttcgttaac atgagcattt cgcttctcta caataaacct 
61681 aagtacttgt gtttctcaac attcgtcaaa atcttcccag aatttatacg cagaaacaag 
61741 caattgaaga agcacaagta ataataataa caaaacacct gaatttgtga gagagatttg 
61801 ggaagagaaa cggaaggatc atcatcagat ccgagatgac aaagcgggaa ttgacactga 
61861 ggatcgtact tcatatggag agtaatcggc cgacgagtct tggttccgcc gccgacatgg 
61921 ttaaattgaa accccataag agcttccaca agcgcacttt taccgtcggt ctgctgtccc 
61981 accacaagaa ccgccggtgc ttcgaacggc gtctccaatt cctgcgccaa agcgtgtaac 
62041 tcgttgtaag cttcgtaaag actccaccgc tcctcaatcg cagcgtcgtc ctcttccgcc 
62101 atttcctcaa ccgtcaccga ttttgctgat acttccgcca tcgtctctta cgaaaatgag 
62161 caagaggaag agtaagagta agagagtgtc tcttatttct tctactcttt agttttcgtc 
62221 gccgttcctt tttccgccat ggaattagca gatacggcta atttcaattt ttgtcaaaag 
62281 aaatattttt tgtgttttaa tctcacgcgc atccatggcg cgttgagtca acgttgtaat 
62341 agttctccgc taaatttaaa taaaagagcg cgtaaggaga gagtttaagg attttttttt 
62401 tttggtcggc aaatacaaag gatttgcttt gtcttgacca atagtatatg cagaaatatt 
62461 atctcaaagg atttgtgata actatgtagt acagaattgt gattattgga tgagaaacca

62521 gaaatatttt gagcaaatga cgacttgtta atttactatt ttttcatttc ttaaaggtct 

62581 ctcttgtgta actatgatta aaattgaaat agtgactttt attgttacga catggaacaa

62641 atcaacgagt tctattgtta aagagagaca ttgatgaatg taacaaaact gtggcttaga

62701 agccgaaagg agacttagtt cgggtccctc cttcaccgta ttgctcgttc cattttctca 

62761 attcgttcat tgtcgtcgcg tcgtatgcca ctgacggact tacctgcaaa ttacattaca 

62821 atgacgcaat ttcgataatg caaacaccag gggaaaaaac atgaatagag atgatgatga
62881 tgttttttaa gagattgatc aataccttag ctttggattg aatgaagtcg tccaaactca 
62941 gtggtcgtag atcaggggac gcatttgtta ccgagtcctg ataattcgac gtttcaaaag 
63001 catggagtga gtacaaaaat tatttttcgt aacaacagaa atcaactgtg tgggtttatg 
63061 catgtcctta ccttgttttc ttcttgtaac aattcttgaa caggtctgta tgcagctgct 
63121 atgcatagat tctgcaatgt aagaaaagaa aaggaatcag aactactgtg ttgaatcata 
63181 ctcgaacttg taaatgaaac cccgaatgac caaaccttta gatcgcttcc tgaatatcct 
63241 tcggtttcct ttgcaagttt atcaaactcg aaaccagttt caagattttc tggtgtcaga 
63301 aatatcttca atatcttcaa ccggttttcc gcatctggta aatccacata tatcctataa 
63361 acacaagcct caatacaatt atcgaaaaga tacaaatatt ccaaaggaga aattacttga 
63421 aagcttaaat taccgtcttg gtagcctacg aatgacagcg tcatcaagat caaaaggtcg 
63481 gttggtggca ccgagaatga gaatcctttg gctatctttt gatctgagtc catcccaagc 
63541 tgccataaac tcatttctca ttcttcgtgt tgcctcgtgc tcaaaagcac caccacgagc 
63601 acccaacaaa ctgtcaacct atacgacaac aaaataaatt acagttagtc cttgagtaac 
63661 acattttacg catcacaaaa gtattcctca taaaaagcaa taaccgaaat tgaaaagtga 
63721 tataaagcta aacaatttct cacctcatca acaaatataa tgacgggggc tagtttgctt 
63781 gcaaaagaga acaaagcctt cgtgagcttc tctgcatctc caaaccactg tgccaaacaa 

63841 tggacgaaat tgacttaaat cagaaccaat cagaggtaaa gttggaaaga gatttactct
63901 aagttacaat cggcattgac aataataagt cgatgaccgg ggtggaaaag tttttcttat
63961 gtcattagat attctcctta tttatatgaa gatgtttaca aagtggaata tcaacgtgac
SEQ ID NO:12



gaaa ttagccgtat ctgctaattc catggcggaa aaaggaacgg 
cgacgaaaac taaagagtag aagaaataag agacactctc ttactcttac 
tcttcctctt gctcattttc gtaagagacg

ATGGCGGAAGTATCAGCAAAATCGGTGACGGTTGAGGAAATGGCGGAAGA 

GGACGACGCTGCGATTGAGGAGCGGTGGAGTCTTTACGAAGCTTACAACG 

AGTTACACGCTTTGGCGCAGGAATTGGAGACGCCGTTCGAAGCACCGGCG 

GTTCTTGTGGTGGGACAGCAGACCGACGGTAAAAGTGCGCTTGTGGAAGC 

TCTTATGGGGTTTCAATTTAACCATGTCGGCGGCGGAACCAAGACTCGTC 

GGCCGATTACTCTCCATATGAAGTACGATCCTCAgTGTCAATTCCCGCTT 

tGTCATCTCGGATCTGATGATGATCCTTCCGTTTCTCTTCCCAAATCTCT 

CTCACAAATTCACGCATATATTGAGGCTGAgAACATGAGGCTGGAGCAAG 

AGCCATGTaGCCCATTCTCTGCAAAGGAGATTATTGTGAAAGTCCAGTAT 

AAGTATTGTCCAAACCTTACCATCATTGATACACCTGGACTTATTGCTCC 

TGCACCAGGACTGAAAAACCGAGCTCTTCAGGTTCAAGCACGGGCTGTGG 

AAGCTCTAGTCCGAGCAAAGATGCAACACAAAGAGTTCATCATTTTATGC 

CTCGAAGATAGCAGTGACTGGAGCATTGCAACCACTCGAAGGATAGTGAT 

GCAAGTTGATCCTGAGCTTTCTAGGACAATTGTTGTTTCTACAAAGCTTG 

ACACTAAAATCCCTCAATTCTCATGTTCATCTGACGTGGAAGTCTTTCTC 

TCACCTCCTGCAAGCGCACTTGACAGCTCCTTATTGGGCGATTCTCCTTT 

TTTCACGTCTGTGCCTTCTGGAAGAGTTGGCTATGGACAGGATTCAGTGT 

ATAAGTCTAATGACGAGTTCAAACAGGCTGTGTCACTTAGAGAAATGGAA 

GACATTGCATCTTTAGAGAAGAAGTTGGGCCGTTTACTGACAAAACAGGA 

AAAGAGTAGGATTGGCATCAGTAAACTGAGGTTGTTTCTGGAAGAACTAC 

TCTGGAAAAGGTACAAAGAGAGTGTTCCATTGATCATTCCACTGTTAGGA 

AAGGAGTACCGCAGTACAGTCAGAAAGCTGGATACGGTGAGCAAGGAACT 

TAGCTCTTTGGATGAAGCAAAACTCAAAGAGAGAGGCAGGACTTTCCATG 

ATCTCTTCTTAACCAAGTTATCGCTGTTATTGAAGGGAACAGTTGTGGCC 

CCTCCAGATAAATTTGGTGAGACACTGCAAGATGAAAGGACACAAGGAGG 

AGCATTTGTTGGTACTGATGGTCTCCAGTTTTCACATAAGCTAATACaGA 

ATGCAGGGATGCGTCTCTATGGGGGTGCACAATATCACCGTGCCATGGC 

TGAGTTTCGTTTTCTAGTTGGTGCTATCAAATGTCCCCCAATAACGAGGG 

AGGAAATTGTAAATGCATGTGGAGTTGAGGATATTCATGATGGAACAAA 

CTATTCCAGAACAGCTTGTGTTATAGCAGTTGCGAAGGcTCGTgAGACGT 

TTGAACcTTTCCTTCATCAGTTAGGGGCGAGGCTTCTACACATTCTCAAG 

AGATTGcTTCCAATTTCTGTATATCTTCTTCAGAAAGAAGGTGAATATTT 

AAGTGGGCATGAGGTGTTTCTCAAGCGGGTTGCTTCAGCATTCAACAGTT 

TTGTGGAGTCCACAGAAAAATCATGTCGTGACAAATGTATGGAGGATTTA 

GCAAGTACAACTCGCTATGTTACATGGTCTCTTCACAACAAGAACCGAGC 

TGGTCTACGTCAATTCTTGGAcTCATTTGGTGGAACAGAGCATAATACG 

ACATCAGGTAATGCCATAgGATTTAGTCTTCCCCAAGATGCATTAGGTGG 
CACAACAGACACCAAGTCAAGATCAGATGTAAAGCTAAGCCAT

CTCGCCTCAAACATCGATTCAGGTTCCAGTATTCAGACAACAGAAATGCG 

GTTGGGTGATCTTCTAGATAGCACACTTTGGAACCGCAAGCTTGCTCCTT 

CCTCTGAGAGAATTGTGTACGCATTGGTCCAACAGATATTCCAGGGCATA 

CGAGAGTACTTTCTCGCCTCTGCTGAGTTAAAGTTCAACTGTTTTCTTCT 

AATGCCCATCGTTGATAAGTTACCTGCTCTTCTCCGGGAAGAGTTGGAAA 

ACGCATTTGAAGACGACCTCGATAGTATCTTCGACATCACGAATCTCCGG 

CAATCACTTGATCAAAAGAAACGGAGCACAGAGATCGAGCTCAGAAGGgT 

AAAGAGGATAAAAGAGAAATTCAGAGTGATGAATGAGAAGCTAAACTCTC 

ATGAATTTGCTCAAAATCTAAAGGCTCCTTCGGTGCAGCATTGA

gtgact 

caagttcaatattgcttaattatattaggttaagaaacaatcgaagagtg 
agggaacatcttatgtgtactttgtatgtccaaaaacatactaaagaacg 
ttgtcgtttcaagtgattaaggtttcgctttttggtccaatgtttgcaaa 
tttcagttttgtagaaacgacagtcgtatcatttatttctaaataaatta 
taatcagtaaatct 
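The lowercase flanking sequence at the head of this listing can be located in the numbered genomic listing by reverse complementing it. This suggests, as an inference rather than a statement from the source, that the coding strand is the complement of the numbered strand. A minimal sketch follows; the function name is hypothetical.

```python
# Complement map for unambiguous DNA letters, preserving case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # First lowercase line of the listing above, with spaces removed.
    leader = "gaaattagccgtatctgctaattccatggcggaaaaaggaacgg"
    # Genomic line numbered 62221 in the numbered listing, spaces removed.
    genomic_62221 = (
        "gccgttcctttttccgccatggaattagcagatacggctaatttcaatttttgtcaaaag"
    )
    rc = reverse_complement(leader)
    # Offset of the match within the 62221 line, or -1 if absent.
    print(rc, genomic_62221.find(rc))
```

If the match is found, the position within the genomic line indicates where the flanking sequence sits on the opposite strand.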
SEQ ID NO:13

MAEVSAKSVTVEEMAEEDDAAIEERWSLYEAYNELHALAQELETPFEAPAVLVVGQQTDGKSALVEALMG
FQFNHVGGGTKTRRPITLHMKYDPQCQFPLCHLGSDDDPSVSLPKSLSQIHAYIEAENMRLEQEPCSPFS
AKEIIVKVQYKYCPNLTIIDTPGLIAPAPGLKNRALQVQARAVEALVRAKMQHKEFIILCLEDSSDWSIA
TTRRIVMQVDPELSRTIVVSTKLDTKIPQFSCSSDVEVFLSPPASALDSSLLGDSPFFTSVPSGRVGYGQ
DSVYKSNDEFKQAVSLREMEDIASLEKKLGRLLTKQEKSRIGISKLRLFLEELLWKRYKESVPLIIPLLG
KEYRSTVRKLDTVSKELSSLDEAKLKERGRTFHDLFLTKLSLLLKGTVVAPPDKFGETLQDERTQGGAFV
GTDGLQFSHKLIQNAGMRLYGGAQYHRAMAEFRFLVGAIKCPPITREEIVNACGVEDIHDGTNYSRTACV
IAVAKARETFEPFLHQLGARLLHILKRLLPISVYLLQKEGEYLSGHEVFLKRVASAFNSFVESTEKSCRD 
KCMEDLASTTRYVTWSLHNKNRAGLRQFLDSFGGTEHNTTSGNAIGFSLPQDALGGTTDTKSRSDVKLSH 
LASNIDSGSSIQTTEMRLADLLDSTLWNRKLAPSSERIVYALVQQIFQGIREYFLASAELKFNCFLLMPI 
VDKLPALLREELENAFEDDLDSIFDITNLRQSLDQKKRSTEIELRRVKRIKEKFRVMNEKLNSHEFAQNL 
KAPSVQH 



Fig. 12 



SEQ ID NO:14 

56041 actgtaaatt ttgataaata aaaaaaaaca aaaaaaagat cgccaaatca tatttcatac 

56101 tatcagattt aaacaatata atttgttcga cgatacagaa atattttacc tcacaggaag 

56161 aggttgcgca gaaggagcca tggatgtgtt tgttcgagtc gagttgcttt gttgtaagta 

56221 ggtaattgca agaaacttga gttgtctata aagctttgga atacttctct ttatatatac 

56281 gtttacaaca attttttttt tttttttttt tctattttta caacaaattg ttttttatta 

56341 taataataaa cttaaacgaa aataaataat atctctttgt tctatttctt aaaaaagaaa 

56401 ttagcttgta gtacttcaac gtatcttaac tctttagtct ttagtaggta tatatcatct 

56461 atttatttat ttttattttt tttatattac gattatagtg tacgtacgta tttattaatc 

56521 aaaaataact tggtagaagt aaaaagaaaa tgattttttt tttactcagt gatcagtttt 

56581 acgtttattc aaaaataagt tgtagtttcc ttcttaatat tcaagttata tgactaaaaa 

56641 ttggtcggtt aatttactat taagattaat cggaaactct agttagatca cgagataatc 

56701 atcacgtgga gaaacatttg gttcttgtca cgtggagaaa acgttaagct tattttttac 

56761 ttctttatta tatttttgag gaaatggttg aaagaaagag agtgtttaaa atgtgaatgc 

56821 gctcgtagtt aggtggaggt taatgggtag gagggtaggt catatgtgta ttagtgatgg 

56881 ataaaaatta aaaacataaa aaaaacttca agctgtaaat aatctaataa aagaacatag 

56941 aaatataatc aaagaaccat ttaactaaat aaatactttc gattcaaata gcatatttct 

57001 aagttccaag aatagctatc ctctatccac atgttacatt ttttttttct ttttcacatc 

57061 catatagttt ttaaaataat tttctagatg gtatttttta ttcgacattt ttttttcctt 

57121 ttagatttac tgattataat ttatttagaa ataaatgata cgactgtcgt ttctacaaaa 

57181 ctgaaatttg caaacattgg accaaaaagc gaaaccttaa tcacttgaaa cgacaacgtt 

57241 ctttagtatg tttttggaca tacaaagtac acataagatg ttccctcact cttcgattgt 

57301 ttcttaacct aatataatta agcaatattg aacttgagtc actcaatgct gcaccgaagg 

57361 agcctttaga ttttgagcaa attcatgaga gtttagcttc tcattcatca ctctgaattt 

57421 ctcttttatc ctctttatct gtccaaaaca tgacacataa cataatgtta gttctcctgc 

57481 atacttccaa tggcaaatag aaaaaagaga cattgatcat agaagtcagt ttggtttacc 

57541 cttctgagct cgatctctgt gctccgtttc ttttgatcaa gtgattgccg gagattcgtg 

57601 atgtcgaaga tactatcgag gtcgtcttca aatgcgtttt ccaactcttc ccggagaaga 

57661 gcaggtaact tatcaacgat gggcattaga agaaaacagt tgaactgcag aacaaaagaa 

57721 aacacagata caaacttttt aaaagaaaag tcattttaaa agcaagaaga atctgagtaa 

57781 aaactgaagt aggagcaaac ctttaactca gcagaggcga gaaagtactc tcgtatgccc 

57841 tggaatatct gttggaccaa tgcgtacaca attctctcag aggaaggagc aagcttgcgg 

57901 ttccaaagtg tgctatctag aagatcagcc aaccgcattt ctgttgtctg aatactggaa 

57961 cctgaatcga tgtttgaggc gagatggctt agctttacat ctgatcttga cttggtgtct 

58021 gttgtgccac ctaatgcatc ttggggaaga ctaaatccta tggcattacc tgatgtcgta 

58081 ttatgctctg ttccaccaaa tgagtccaag aattgacgta gaccagctcg gttctacata 

58141 acattgagaa acgaaaacta ctcaatcaga aacggatact tgatggtatg tacacaactc 

58201 aattggattg aaacagagct atagggctgt agcaatgacc ttgttgtgaa gagaccatgt

58261 aacatagcga gttgtacttg ctaaatcctc catacatctg caaacaatat aaaatccaaa 

58321 gggtgatcaa tcactaaagc tcactagaac acaggtagga ggcaccgaca tggtaagaac 

58381 aggaattgga aatagaatta cttgtcacga catgattttt ctgtggactc cacaaaactg 

58441 ttgaatgctg aagcaacccg cttgagaaac acctcatgcc cacttaaata ttcaccttct

58501 ttctattcaa atttagaaca tacatcaaaa aatttgctgg aaagggatca tgagtatgat 

58561 accgtcaaac caaagaaaac agtacctacc tgaagaagat atacagaaat tggaagcaat 

58621 ctcttgagaa tgtgtagaag cctcgcccct aactatatca acgcaaaaca aacgaaaatg 

58681 agaactggaa aaaactttct gtatggaaag agaaacatgt gaataacaaa atttcagatg 

58741 aaagtattcc caaacatagt ttctgtaagc agaacatgtt tactcgataa ctcttatgca 

58801 caaataagtt ccagcaaatc tcaaaactga atggtagtat gatttcaata tataacgtta 

58861 tatttcattt ttttttttac gtacagtaca ccttaactaa ttagtaaaat tgctttccat 

58921 cctccacgaa agaaaaagaa aaaagtagct atatctatgt cacctgatga aggaaaggtt 

58981 caaacgtctc acgagccttc gcaactgcta taacacaagc tgttctacaa cagcaaataa 

59041 gagaaagaga ataagaggcc atagaaaaca tgacaaacgt tgcagctcag attagatact



Fig. 12, continued 2/3 



59101 gaaaggggtc tgggatgcaa agacaataaa ttgagaagtg tgttgcatgt cagtcaatcc 

59161 tatgatacct ggaatagttt gttccatcat gaatatcctc aactccacat gcatttacaa 

59221 tttcctccct cgttattggg ggacatttga tagcaccaac tagaaaacga aactcagcca 

59281 tggcacggtg atattgtgca cccccataga gacgcatccc tgcattctgt aaaatgaaag 

59341 ataatctggt tatggtctct cataattctt gaaggtccaa cgaagtatct cttttatttg 

59401 tttccaatac attattcttt ggcacatatg tttcatgcgg tcaaatttat cttccatcat 

59461 attataatcc atgtacaaga acaagacaac tggatttgaa gaccatgccc agcttgctct 

59521 ataaagtcca acaatattct gcttcaggga aagacttacc ggtattagct tatgtgaaaa 

59581 ctggagacca tcagtaccaa caaatgctcc tccttgtgtc ctttcatctt gcagtgtctc 

59641 acctgaaaaa caccatgaga aattattaac aatcaaagaa cccaacataa agagaatgct 

59701 gttataaaat gtgcttctgc cagtaaccaa agtatcatga ccaatgattg attgattagc 

59761 atacatcatt ccatgtgtaa tcatcgcagt ctggtgaccc agtcgaattg aacaatatgc 

59821 atttaactaa actgattttg caaaagtcca atttaacaac acccagaaac aagaaaagtt 

59881 tatgccaaag aagttgacta gcagagaaca gagcagtaac attaccaaat ttatctggag 

59941 gggccacaac tgttcccttc aataacagcg ataactgatc aagaaaaata taaacaaaac 

60001 aggtgagaaa acacagcact gatcaatact aacaaaggta cttcgtacgt caatcagaaa 

60061 atatgacgca gcaattttaa agtcttaagg gcatccaaca caaaaagttt acagccattc 

60121 tgaatttgta gcaagtccta gatatcattt actgtagcat aattttatat gtgtcagtaa 

60181 tcaataaaca aatttgtttt tatgtgtcag tagttaataa accaaaaaaa aagagaagtt 

60241 tacacaaatg aacttgttgt aattatacaa aaactattaa tccacgagtc caggcaaaaa 

60301 tgaaaaggta tgggaaggtg taaatagaaa tctaaaaaaa cgaaatgctc tctacagtta

60361 ccttggttaa gaagagatca tggaaagtcc tgcctctctc tttgagtttt gcttcatcca 

60421 aagagctgca ttgaaaggaa ttattcaacc tccaatgagt tatattttct ataaatcagt 

60481 agctaacaat taaactgcct aaaatcaagt agacattttc agacaaaaca aattgcgacc 

60541 taagttcctt gctcacggta tccagctttc tgactgtact gcggtactcc tttcctaaca 

60601 gtggaatgat caatggaaca ctctctttgt acctggaaag agaagggcat caagactaca 

60661 gcgaaaagta aactacaata gaaacagagg ctggaaaaat cagagttaaa acaacagtta 

60721 taccttttcc agagtagttc ttccagaaac aacctcagtt tactgatgcc aatcctactc 

60781 ttttcctgtt ttgtcagtaa acggcccaac ttcttctcta aagatgcaat gtcttccatt 

60841 tctctaagtg acacagcctg taataaaaac cacacatagt ttagaaaaag acctgtttaa 

60901 cttgtttaag gaatcagaca gcagagcaga gacctgtttg aactcgtcat tagacttata 

60961 cactgaatcc tgtccatagc caactcttcc agaaggcaca gacgtgaaaa aaggagaatc 

61021 gcccaataag gagctgtcaa gtgcgcttgc aggaggtgag agaaagactt ccacgtcaga 

61081 tgaacatgag aattgaggga ttttagtgtc aagctttgta gaaacaacaa ttgtcctaga 

61141 aagctcagga tcaacctaca tgaacgagaa acaaacttta acaaaaataa agacaaggtt 

61201 agacgcaatg gagttacgtc aagcaacgta cttgcatcac tatccttcga gtggttgcaa 

61261 tgctccagtc actgctatct tcgaggcata aaatgatgaa ctctttgtgt tgcatctttg

61321 ctcggactag agcttccaca gcccgtgctt gaacctaaga aaaagaacaa gtaacccact 

61381 ctcaaataaa gcaaaaccaa aacatgaaat cagccacgga attggctgga agccataaga

61441 aaaaacaacc tgaagagctc ggtttttcag tcctggtgca ggagcaataa gtccaggtgt 

61501 atcaatgatg gtaaggtttg gacaatactt atactggact ttcacaataa tctcctttgc 

61561 agagaatggg ctacatggct cttgctccag cctcatgttc tcagcctcaa tatatgccta 

61621 actccaaatc atataacaaa tttcgttaac atgagcattt cgcttctcta caataaacct 

61681 aagtacttgt gtttctcaac attcgtcaaa atcttcccag aatttatacg cagaaacaag 

61741 caattgaaga agcacaagta ataataataa caaaacacct gaatttgtga gagagatttg 

61801 ggaagagaaa cggaaggatc atcatcagat ccgagatgac aaagcgggaa ttgacactga 

61861 ggatcgtact tcatatggag agtaatcggc cgacgagtct tggttccgcc gccgacatgg 

61921 ttaaattgaa accccataag agcttccaca agcgcacttt taccgtcggt ctgctgtccc

61981 accacaagaa ccgccggtgc ttcgaacggc gtctccaatt cctgcgccaa agcgtgtaac 

62041 tcgttgtaag cttcgtaaag actccaccgc tcctcaatcg cagcgtcgtc ctcttccgcc 
62101 atttcctcaa ccgtcaccga ttttgctgat acttccgcca tcgtctctta cgaaaatgag

62161 caagaggaag agtaagagta agagagtgtc tcttatttct tctactcttt agttttcgtc 

62221 gccgttcctt tttccgccat ggaattagca gatacggcta atttcaattt ttgtcaaaag 

62281 aaatattttt tgtgttttaa tctcacgcgc atccatggcg cgttgagtca acgttgtaat

62341 agttctccgc taaatttaaa taaaagagcg cgtaaggaga gagtttaagg attttttttt 

62401 tttggtcggc aaatacaaag gatttgcttt gtcttgacca atagtatatg cagaaatatt 
62461 atctcaaagg atttgtgata actatgtagt acagaattgt gattattgga tgagaaacca

62521 gaaatatttt gagcaaatga cgacttgtta atttactatt ttttcatttc ttaaaggtct

62581 ctcttgtgta actatgatta aaattgaaat agtgactttt attgttacga catggaacaa

62641 atcaacgagt tctattgtta aagagagaca ttgatgaatg taacaaaact gtggcttaga

62701 agccgaaagg agacttagtt cgggtccctc cttcaccgta ttgctcgttc cattttctca

62761 attcgttcat tgtcgtcgcg tcgtatgcca ctgacggact tacctgcaaa ttacattaca

62821 atgacgcaat ttcgataatg caaacaccag gggaaaaaac atgaatagag atgatgatga

62881 tgttttttaa gagattgatc aataccttag ctttggattg aatgaagtcg tccaaactca
SEQ ID NO:15



1 atggcggaag tatcagcaaa atcggtgacg gttgaggaaa tggcggaaga ggacgacgct 
61 gcgattgagg agcggtggag tctttacgaa gcttacaacg agttacacgc tttggcgcag 
121 gaattggaga cgccgttcga agcaccggcg gttcttgtgg tgggacagca gaccgacggt 
181 aaaagtgcgc ttgtggaagc tcttatgggg tttcaattta accatgtcgg cggcggaacc 
241 aagactcgtc ggccgattac tctccatatg aagtacgatc ctcagtgtca attcccgctt 
301 tgtcatctcg gatctgatga tgatccttcc gtttctcttc ccaaatctct ctcacaaatt 
361 caggcatata ttgaggctga gaacatgagg ctggagcaag agccatgtag cccattctct 
421 gcaaaggaga ttattgtgaa agtccagtat aagtattgtc caaaccttac catcattgat 
481 acacctggac ttattgctcc tgcaccagga ctgaaaaacc gagctcttca ggttcaagca 
541 cgggctgtgg aagctctagt ccgagcaaag atgcaacaca aagagttcat cattttatgc 
601 ctcgaagata gcagtgactg gagcattgca accactcgaa ggatagtgat gcaagttgat 
661 cctgagcttt ctaggacaat tgttgtttct acaaagcttg acactaaaat ccctcaattc 
721 tcatgttcat ctgacgtgga agtctttctc tcacctcctg caagcgcact tgacagctcc 
781 ttattgggcg attctccttt tttcacgtct gtgccttctg gaagagttgg ctatggacag 
841 gattcagtgt ataagtctaa tgacgagttc aaacaggctg tgtcacttag agaaatggaa 
901 gacattgcat ctttagagaa gaagttgggc cgtttactga caaaacagga aaagagtagg 
961 attggcatca gtaaactgag gttgtttctg gaagaactac tctggaaaag gtacaaagag 
1021 agtgttccat tgatcattcc actgttagga aaggagtacc gcagtacagt cagaaagctg 
1081 gataccttat cgctgttatt gaagggaaca gttgtggccc ctccagataa atttggtgag 
1141 acactgcaag atgaaaggac acaaggagga gcatttgttg gtactgatgg tctccagttt 
1201 tcacataagc taataccgaa tgcagggatg cgtctctatg ggggtgcaca atatcaccgt 
1261 gccatggctg agtttcgttt tctagttggt gctatcaaat gtcccccaat aacgagggag 
1321 gaaattgtaa atgcatgtgg agttgaggat attcatgatg gaacaaacta ttccagaaca 
1381 gcttgtgtta tagcagttgc gaaggctcgt gagacgtttg aacctttcct tcatcagaaa 
1441 gttttttcca gttctcattt tcgtttgttt tgcgttgata tagttagggg cgaggcttct 
1501 acacattctc aagagattgc ttccaatttc tgtatatctt cttcaggtag gtactgtttt 
1561 ctttggtttg acggtgaata tttaagtggg catgaggtgt ttctcaagcg ggttgcttca 
1621 gcattcaaca gttttgtgga gtccacagaa aaatcatgtc gtgacaaatg tatggaggat 
1681 ttagcaagta caactcgcta tgttacatgg tctcttcaca acaagaaccg agctggtcta 
1741 cgtcaattct tggactcatt tggtggaaca gagcataata cgacatcagg taatgccata 
1801 ggatttagtc ttccccaaga tgcattaggt ggcacaacag acaccaagtc aagatcagat 
1861 gtaaagctaa gccatctcgc ctcaaacatc gattcaggtt ccagtattca gacaacagaa 
1921 atgcggttgg ctgatcttct agatagcaca ctttggaacc gcaagcttgc tccttcctct 
1981 gagagaattg tgtacgcatt ggtccaacag atattccagg gcatacgaga gtactttctc 
2041 gcctctgctg agttaaagtt caactgtttt cttctaatgc ccatcgttga taagttacct 
2101 gctcttctcc gggaagagtt ggaaaacgca tttgaagacg acctcgatag tatcttcgac 
2161 atcacgaatc tccggcaatc acttgatcaa aagaaacgga gcacagagat cgagctcaga 
2221 aggataaaga ggataaaaga gaaattcaga gtgatgaatg agaagctaaa ctctcatgaa 
2281 tttgctcaaa atctaaaggc tccttcggtg cagcattga 
SEQ ID NO:16

MAEVSAKSVTVEEMAEEDDAAIEERWSLYEAYNELHALAQELETPFEAPAVLVVGQQTD
GKSALVEALMGFQFNHVGGGTKTRRPITLHMKYDPQCQFPLCHLGSDDDPSVSLPKSLS
QIQAYIEAENMRLEQEPCSPFSAKEIIVKVQYKYCPNLTIIDTPGLIAPAPGLKNRALQ
VQARAVEALVRAKMQHKEFIILCLEDSSDWSIATTRRIVMQVDPELSRTIVVSTKLDTK
IPQFSCSSDVEVFLSPPASALDSSLLGDSPFFTSVPSGRVGYGQDSVYKSNDEFKQAVS
LREMEDIASLEKKLGRLLTKQEKSRIGISKLRLFLEELLWKRYKESVPLIIPLLGKEYR
STVRKLDTLSLLLKGTVVAPPDKFGETLQDERTQGGAFVGTDGLQFSHKLIPNAGMRLY
GGAQYHRAMAEFRFLVGAIKCPPITREEIVNACGVEDIHDGTNYSRTACVIAVAKARET
FEPFLHQKVFSSSHFRLFCVDIVRGEASTHSQEIASNFCISSSGRYCFLWFDGEYLSGH
EVFLKRVASAFNSFVESTEKSCRDKCMEDLASTTRYVTWSLHNKNRAGLRQFLDSFGGT
EHNTTSGNAIGFSLPQDALGGTTDTKSRSDVKLSHLASNIDSGSSIQTTEMRLADLLDS
TLWNRKLAPSSERIVYALVQQIFQGIREYFLASAELKFNCFLLMPIVDKLPALLREELE
NAFEDDLDSIFDITNLRQSLDQKKRSTEIELRRIKRIKEKFRVMNEKLNSHEFAQNLKA 
PSVQH 



Fig. 15 
SEQ ID NO:17

MQELYTNRTVLNRPRFAVNVRPTRLKRNQQSQSKMQSHSKDPIN

AESRSRFEAYNRLQAAAVAFGEKLPIPEIVAIGGQSDGKSSLLEALLGFRFNVREVEM 
GTRRPLILQMVHDLSALEPRCRFQISRIFFVELAILITDLDEDSEEYGSPIVSATAVA 
DVIRSRTEALLKKTKTAVSPKPIVMRAEYAHCPNLTIIDTPGFVLKAKKGEPETTPDE 
ILSMVKSLASPPHRILLFLQQSSVEWCSSLWLDAVREIDSSFRRTIWVSKFDNRLKE 
FSDRGEVDRYLSASGYLGENTRPYFVALPKDRSTISNDEFRRQISQVDTEVIRHLREG 
VKGGFDEEKFRSCIGFGSLRDFLESELQKRYKEAAPATLALLEERCSEVTDDMLRMDM 
KIQATSDVAHLRKAAMLYTASISNHVGALIDGAANPAPEQWGKTTEEERGESGIGSWP 
GVSVDIKPPNAVLKLYGGAAFERVIHEFRCAAYSIECPPVSREKVANILLAHAGRGGG 
RGVTEASAEIARTAARSWLAPLLDTACDRLAFVLGSLFEIALERNLNQNSEYEKKTEN 
MDGYVGFHAAVRNCYSRFVKNLAKQCKQLVRHHLDSVTSPYSMACYENNYHQGGAFGA 
YNKFNQASPNSFCFELSDTSRDEPMKDQENIPPEKNNGQETTPGKGGESHITVPETPS 
PDQPCEIVYGLVKKEIGNGPDGVGARKRMARMVGNRNIEPFRVQNGGLMFANADNGMK 
SSSAYSEICSSAAQHFARIREVLVERSVTSTLNSGFLTPCRDRLWALGLDLFAVNDD 
KFMDMFVAPGAIWLQNERQQLQKRQKILQSCLTEFKTVARSL
SEQ ID NO:18

MANSNTYLTTPTKTPSSRRNQQSQSKMQSHSKDPINAESRSRFEAYNRLQAAAVAFGEK 
LPIPEIVAIGGQSDGKSSLLEALLGFRFNVREVEMGTRRPLILQMVHDLSALEPRCRFQ 
DEDSEEYGSPIVSATAVADVIRSRTEALLKKTKTAVSPKPIVMRAEYAHCPNLTIIDTP 
GFVLKAKKGEPETTPDEILSMVKSLASPPHRILLFLQQSSVEWCSSLWLDAVREIDSSF 
RRTIWVSKFDNRLKEFSDRGEVDRYLSASGYLGENTRPYFVALPKDRSTISNDEFRRQ 
ISQVDTEVIRHLREGVKGGFDEEKFRSCIGFGSLRDFLESELQKRYKEAAPATLALLEE 
RCSEVTDDMLRMDMKIQATSDVAHLRKAAMLYTASISNHVGALIDGAANPAPEQWGKTT 
EEERGESGIGSWPGVSVDIKPPNAVLKLYGGAAFERVIHEFRCAAYSIECPPVSREKVA 
NILLAHAGRGGGRGVTEASAEIARTAARSWLAPLLDTACDRLAFVLGSLFEIALERNLN 
QNSEYEKKTENMDGYVGFHAAVRNCYSRFVKNLAKQCKQLVRHHLDSVTSPYSMACYEN 
NYHQGGAFGAYNKFNQASPNSFCFELSDTSRDEPMKDQENIPPEKNNGQETTPGKGGES 
HITVPETPSPDQPCEIVYGLVKKEIGNGPDGVGARKRMARMVGNRNIEPFRVQNGGLMF 
ANADNGMKSSSAYSEICSSAAQHFARIREVLVERSVTSTLNSGFLTPCRDRLWALGLD 
LFAVNDDKFMDMFVAPGAIWLQNERQQLQKRQKILQSCLTEFKTVARSL 



Fig. 17 
SEQ ID NO:19 



1 ttcatgttct tagaagttct aaattttgat catctcttat ttgaaagctc aactaaaata 
61 gctatgatat cattccctga tgctacgtac taggttttta aattcataca cacacaaatc 
121 tataattaaa acttgttaaa ttcatacaca caaaggacaa atcttcttcg tattaaaaaa 
181 gatggaggct ctggaacatc tagtggtgcc gtatcactta cttgactggt tcaagccgtt 
241 tgtctttgtt tggaagaagt aaatttaatt gtgggagagg gatttcacga atttaaatct 
301 gtttttctcc cttttcgtgg tatactttgg accttttgga tatgaacaca tatgtgaaaa 
361 cgttaattca tgtgtttgaa aagtaattaa tcgcgccgtc cgtcttatag ctttgggatg 
421 ggccaatagg atatttaaga gataagaaaa ctaatcagaa acacagacga aggtatctca 
481 ctctctctct ttctctctcc ATGAGAACTC TAATCTCTCA CCGGCAATGT GTGACGTCAC 
541 CGTTTCTTAT CTCCGCCGCA TCTCCACCGT TTCCTGGCCG GTGCTTTAAG TTATCCTCCT 
601 TTACTCCTCC ACGTCATAGG CGTTTTTCTT CTCTCTCGAT CAGAAACATT TCGCATGAAT 
661 CCGCCGATCA GACTTCTTCT TCTAGGCCGC GAACTCTTTA TCCTGGTGGT TACAAGCGTC 
721 CCGAACTCGC CGTTCCCGGT TTACTTCTCC GGCTAGACGC CGACGAGGTT ATGAGCGGGA 
781 ATCGTGAAGA GACTCTTGAT TTGGTCGACC GTGCTTTAGC TAAATCGGTT CAAATCGTCG 
841 TGATTGATGG CGGAGCCACC GCTGGTAAGC TCTACGAGGC GGCTTGTTTG CTGAAATGAC 
901 TTGTCAAAGG CCGTGCTTAC CTCTTGATCG CTGAACGTGT TGATATCGCC TCCGCCGTTG 
961 GTGCTAGTGG TGTTGCTCTC TCCGACGAAG gtaacaactg atttcattca gttttagcat 
1021 ttaatttctc atagagtgag ttttgtctct caatgctatg tacagGTCTT CCGGCGATTG 
1081 TGGCGAGAAA CACATTGATG GGATCCAACG CCGACTCGGT ACTTCTTCCA CTGGTAGCTC 
1141 GGATTGTGAA GGATGTTGAT TCTGCTCTAA TTGCCTCAAG CTCCGAGGGT GCTGATTTCC 
1201 TTATACTTGG ATCTGGTGAA GAAGATACGC AAGTGGCGGA TTCTTTGTTG AAGAGCGTGA 
1261 AAATACCGAT ATATGTGACT TGCAGAGGCA ATGAAGAAGC TAAAGAAGAA TTGCAGTTAC 
1321 TGAAATCAGG TGTTTCTGGT TTTGTTATTT CGTTGAAAGA TTTGCGTTCT TCTAGGGATG 
1381 TAGCTCTTCG CCAGAGTCTT GATGGAGCTT ATGTTGTAAA TAATCATGAG ACACAAAATA 
1441 TGAATGAACT GCCGGAGAAA AAGAATTCTG CTGGCTTCAT AAAATTAGAG GACAAACAGA 
1501 AACTAATAGT AGAAATGGAG AAATCTGTGT TGAGAGAGAC GATTGAAATC ATCCACAAGG 
1561 CGGCTCCACT Ggtgattttt atttcaaaca tttggtagtt gaagtcaatt ttttgaaatg 
1621 gttctaagta ggtttttgtg tggttataat atggtttcat ttacttcttc gactattttt 
1681 cattaacagA TGGAGGAAGT CTCCCTTCTA ATTGATGCTG TTTCTCGGAT CGATGAGCCG 
1741 TTTCTGATGG TTATAGTGgt aattctgcac tcaactccgt caaattgtga ttccaggaat 
1801 ttgcattggt attagctcta tattcattcc agaaacattt tagttacaca cttttgccag 
1861 cactagatag cttgagatac aatgggcatg cttctagtca cttgtccttt agtgcttctc 
1921 aatatcttct ttcgtcgcct atgactatga tgtttcgctt cttcttttgt tctgtctatg 
1981 cttctcttct taatttgctt atggatctgg ttgtaaggga actgcatatt tcttaactgt 
2041 accatctgct tgtgtacata gttttttcgc tttcttgtga cttgtgagta tgccgttctt 
2101 ggaagatgtt ttaagtggga caagttgcct ttatgattca aaatagtttt tgtatggata 
2161 attaattgga atccacaatt tgctggtact agGGGGAATT TAACTCTGGA AAATCAACGG 
2221 TTATCAATGC ACTTCTTGGG AAGAGATACC TGAAAGAAGG GGTAGTCCCC ACTACCAATG 
2281 AAATCACGTT TCTGTGCTAC TCTGACTTGG AATCCGAAGA GCAACAACGT TGCCAAACAC 
2341 ATCCAGATGG CCAATATGTA TGCTATCTTC CTGCACCAAT ACTTAAGGAT gtgagtaatt 
2401 caaaattcta ccatcgcagt cctgaatttt tactaattat ttggaggaat tgatttgggt 
2461 tgttctcctt tcgagcagAT AAATATTGTT GACACACCTG GGACCAATGT GATCCTTCAA 
2521 AGGCAACAGC GTCTTACAGA AGAATTTGTT CCACGTGCAG ATTTGCTTGT TTTTGTTCTT 
2581 TCTGCTGACC GCCCTTTAAC TGAAAGTGAG gtagaagtta ccgttttact tggcatgtta 
2641 gttgttgttg tttttgctca atatgtatct gcctaagtag cttgttagat ctatttttca 
2701 cgaaagtagt tagttaagtc atgtatagac catcaagacc ttgtgtaggg aagggaaagt 
2761 tgtcactagg ttgaatgcat atatcaaggt tttgttgatt ataaatttaa actagactaa 
2821 tttattttca aagtaatgag tgttatagct attgctggaa ccagtatgtc ctgttggtcc 
2881 atattttggt aaagcttagg ccaatacatt tgagaggtga gttgttattg gtacagcaaa 
2941 actgatttta cgtccatggc aaattgtatg taaatgatca tctacgaagt actaacctta 

3001 tgaatatttg gttcttattt tgaaaatctg aaaaagtttc aaaagaagga ataagcttct 
3061 caatgtcatc atacccatgt catttctatc tctacctctg gagcttcctg ctgtcttgat 
3121 tttactgtag gctgatttac atctcattgc gtttgtcagG TTGCGTTTCT CCGGTACACA 
3181 CAGCAGTGGA AAAAGAAATT TGTGTTTATT CTGAATAAAT CTGATATCTA TCGTGATGCT 
3241 CGTGAGgttt atcagaaaca atatttatgt cttttccttg atagtctctg taattgctgg 
3301 atttttcttg actaaagatt aattttactg ctgcagCTTG AGGAAGCTAT TTCATTTGTT 
3361 AAAGAGAATA CACGGAAGTT GCTTAATACA GAAAATGTGA TATTGTATCC GGTGTCCGCA 
3421 CGGTCTGCTC TTGAGGCGAA GCTTTCAACA GCTTCTTTGG TTGGCAGAGA TGATCTTGAG 
3481 ATCGCAGATC CTGGTTCTAA TTGGAGAGTC CAGAGCTTCA ATGAACTTGA GAAATTTCTT 
3541 TATAGCTTCT TGGATAGCTC AACAGCTACC GGGATGGAGA GAATAAGGCT TAAATTGGAG 
3601 ACACCCATGG CGATTGCTGA GCGTCTCCTT TCTTCTGTGG AAGCTCTTGT GAGACAAGAT 
3661 TGCCTAGCTG CTAGGGAAGA CTTGGCTTCA GCAGACAAGA TTATCAGTCG AACTAAAGAA 
3721 TACGCGCTTA AGATGGAATA TGAGAGCATT TCTTGGAGAA GGCAGGCTCT CTCGTTGGTA 
3781 TAAattctat tagatattat cttgttgaat cacgaaggag gaaattggat tgttctaact 
3841 tggctttttt gtgttttgta ctctggcttt tatcgcagat tgataatgcc agattacaag 
3901 ttgttgatct gataggaact accctgcgac tatcaagcct tgatcttgcg atctcgtacg 
3961 tgttcaaagg ggaaaaatcg gcctcagtag cagctacatc caaagttcaa ggtgaaatac 
4021 tcgctccagc actcacaaat gcgaaagtaa gtgtgatgct ttattctttg agtattggcc 
4081 taactgggga catgttggtc atatatatga ggtctgagat atagtcacta ttcatgcaga 
4141 aagtaaatat tgtctaacaa tgtcttgttg tgacctgatt gactttacat ttcactgttt 
4201 gcaggaattg cttggaaaat atgctgaatg gctacaatca aatactgccc gtgaagggag 
4261 tctgtctctg aaatcattcg aaa 



Fig. 18 



SEQ ID NO:20 



1 ATGAGAACTC TAATCTCTCA CCGGCAATGT GTGACGTCAC CGTTTCTTAT CTCCGCCGCA 
61 TCTCCACCGT TTCCTGGCCG GTGCTTTAAG TTATCCTCCT TTACTCCTCC ACGTCATAGG 
121 CGTTTTTCTT CTCTCTCGAT CAGAAACATT TCGCATGAAT CCGCCGATCA GACTTCTTCT 
181 TCTAGGCCGC GAACTCTTTA TCCTGGTGGT TACAAGCGTC CCGAACTCGC CGTTCCCGGT 
241 TTACTTCTCC GGCTAGACGC CGACGAGGTT ATGAGCGGGA ATCGTGAAGA GACTCTTGAT 
301 TTGGTCGACC GTGCTTTAGC TAAATCGGTT CAAATCGTCG TGATTGATGG CGGAGCCACC 
361 GCTGGTAAGC TCTACGAGGC GGCTTGTTTG CTGAAATCAC TTGTCAAAGG CCGTGCTTAC 
421 CTCTTGATCG CTGAACGTGT TGATATCGCC TCCGCCGTTG GTGCTAGTGG TGTTGCTCTC 
481 TCCGACGAAG GTCTTCCGGC GATTGTGGCG AGAAACACAT TGATGGGATC CAACCCCGAC 
541 TCGGTACTTC TTCCACTGGT AGCTCGGATT GTGAAGGATG TTGATTCTGC TCTAATTGCC 
601 TCAAGCTCCG AGGGTGCTGA TTTCCTTATA CTTGGATCTG GTGAAGAAGA TACGCAAGTG 
661 GCGGATTCTT TGTTGAAGAG CGTGAAAATA CCGATATATG TGACTTGCAG AGGCAATGAA 
721 GAAGCTAAAG AAGAATTGCA GTTACTGAAA TCAGGTGTTT CTGGTTTTGT TATTTCGTTG 
781 AAAGATTTGC GTTCTTCTAG GGATGTAGCT CTTCGCCAGA GTCTTGATGG AGCTTATGTT 
841 GTAAATAATC ATGAGACACA AAATATGAAT GAACTGCCGG AGAAAAAGAA TTCTGCTGGC 
901 TTCATAAAAT TAGAGGACAA ACAGAAACTA ATAGTAGAAA TGGAGAAATC TGTGTTGAGA 
961 GAGACGATTG AAATCATCCA CAAGGCGGCT CCACTGATGG AGGAAGTCTC CCTTCTAATT 
1021 GATGCTGTTT CTCGGATCGA TGAGCCGTTT CTGATGGTTA TAGTGGGGGA ATTTAACTCT 
1081 GGAAAATCAA CGGTTATCAA TGCACTTCTT GGGAAGAGAT ACCTGAAAGA AGGGGTAGTC 
1141 CCCACTACCA ATGAAATCAC GTTTCTGTGC TACTCTGACT TGGAATCCGA AGAGCAACAA 
1201 CGTTGCCAAA CACATCCAGA TGGCCAATAT GTATGCTATC TTCCTGCACC AATACTTAAG 
1261 GATATAAATA TTGTTGACAC ACCTGGGACC AATGTGATCC TTCAAAGGCA ACAGCGTCTT 
1321 ACAGAAGAAT TTGTTCCACG TGCAGATTTG CTTGTTTTTG TTCTTTCTGC TGACCGCCCT 
1381 TTAACTGAAA GTGAGGTTGC GTTTCTCCGG TACACACAGC AGTGGAAAAA GAAATTTGTG 
1441 TTTATTCTGA ATAAATCTGA TATCTATCGT GATGCTCGTG AGCTTGAGGA AGCTATTTCA 
1501 TTTGTTAAAG AGAATACACG GAAGTTGCTT AATACAGAAA ATGTGATATT GTATCCGGTG 
1561 TCCGCACGGT CTGCTCTTGA GGCGAAGCTT TCAACAGCTT CTTTGGTTGG CAGAGATGAT 
1621 CTTGAGATCG CAGATCCTGG TTCTAATTGG AGAGTCCAGA GCTTCAATGA ACTTGAGAAA 
1681 TTTCTTTATA GCTTCTTGGA TAGCTCAACA GCTACCGGGA TGGAGAGAAT AAGGCTTAAA 
1741 TTGGAGACAC CCATGGCGAT TGCTGAGCGT CTCCTTTCTT CTGTGGAAGC TCTTGTGAGA 
1801 CAAGATTGCC TAGCTGCTAG GGAAGACTTG GCTTCAGCAG ACAAGATTAT CAGTCGAACT 
1861 AAAGAATACG CGCTTAAGAT GGAATATGAG AGCATTTCTT GGAGAAGGCA GGCTCTCTCG 
1921 TTGGTATAA 



Fig. 19 



SEQ ID NO:21 

MRTLISHRQC VTSPFLISAA SPPFPGRCFK LSSFTPPRHR RFSSLSIRNI SHESADQTSS 
SRPRTLYPGG YKRPELAVPG LLLRLDADEV MSGNREETLD LVDRALAKSV QIVVIDGGAT 
AGKLYEAACL LKSLVKGRAY LLIAERVDIA SAVGASGVAL SDEGLPAIVA RNTLMGSNPD 
SVLLPLVARI VKDVDSALIA SSSEGADFLI LGSGEEDTQV ADSLLKSVKI PIYVTCRGNE 
EAKEELQLLK SGVSGFVISL KDLRSSRDVA LRQSLDGAYV VNNHETQNMN ELPEKKNSAG 
FIKLEDKQKL IVEMEKSVLR ETIEIIHKAA PLMEEVSLLI DAVSRIDEPF LMVIVGEFNS 
GKSTVINALL GKRYLKEGVV PTTNEITFLC YSDLESEEQQ RCQTHPDGQY VCYLPAPILK 
DINIVDTPGT NVILQRQQRL TEEFVPRADL LVFVLSADRP LTESEVAFLR YTQQWKKKFV 
FILNKSDIYR DARELEEAIS FVKENTRKLL NTENVILYPV SARSALEAKL STASLVGRDD 
LEIADPGSNW RVQSFNELEK FLYSFLDSST ATGMERIRLK LETPMAIAER LLSSVEALVR 
QDCLAAREDL ASADKIISRT KEYALKMEYE SISWRRQALS LV 
SEQ ID NO:22 

1 actgtcacaa agaactagaa aaggcaagca aaactcaact atgtcaaaag tgtcacttag 
61 attgattctt gaatagcgag acgaagtatc tgggaaaata cggtactgaa ttaacatctc 
121 cgtcagatca taggttcgga ttgaacagat gacacaatta aacaatgatg aagatcaaga 
181 cactttaatc gactgaattc tagttagaac ttagactaaa agtatttaat acttgaagct 
241 caccacttct cgaatatctt gttccaatcg ttttgatgtg gttccggcac tcaagttctg 
301 tattgttttc aagctgactt tatcagtttt ctgaagtaag tcatatgtgt ctatgcccaa 
361 ttgcgttttt gaattgacat atgttggcca tttgttttcg aatgatttca gagacagact 
421 cccttcacgg gcagtatttg attgtagcca ttcagcatat tttccaagca attcctgcaa 
481 acagtgaaat gtaaagtcaa tcaggtcaca acaagacatt gttagacaat atttactttc 
541 tgcatgaata gtgactatat ctcagacctc atatatatga ccaacatgtc cccagttagg 
601 ccaatactca aagaataaag catcacactt actttcgcat ttgtgagtgc tggagcgagt 
661 atttcacctt gaactttgga tgtagctgct actgaggccg atttttcccc tttgaacacg 
721 tacgagatcg caagatcaag gcttgatagt cgcagggtag ttcctatcag atcaacaact 
781 tgtaatctgg cattatcaat ctgcgataaa agccagagta caaaacacaa aaaagccaag 
841 ttagaacaat ccaatttcct ccttcgtgat tcaacaagat aatatctaat agaatttata 
901 ccaacgagag agcctgcctt ctccaagaaa tgctctcata ttccatctta agcgcgtatt 
961 ctttagttcg actgataatc ttgtctgctg aagccaagtc ttccctagca gctaggcaat 
1021 cttgtctcac aagagcttcc acagaagaaa ggagacgctc agcaatcgcc atgggtgtct 
1081 ccaatttaag ccttattctc tccatcccgg tagctgttga gctatccaag aagctataaa 
1141 gaaatttctc aagttcattg aagctctgga ctctccaatt agaaccagga tctgcgatct 
1201 caagatcatc tctgccaacc aaagaagctg ttgaaagctt cgcctcaaga gcagaccgtg 
1261 cggacaccgg atacaatatc acattttctg tattaagcaa cttccgtgta ttctctttaa 
1321 caaatgaaat agcttcctca agctgcagca gtaaaattaa tctttagtca agaaaaatcc 
1381 agcaattaca gagactatca aggaaaagac ataaatattg tttctgataa acctcacgag 
1441 catcacgata gatatcagat ttattcagaa taaacacaaa tttctttttc cactgctgtg 
1501 tgtaccggag aaacgcaacc tgacaaacgc aatgagatgt aaatcagcct acagtaaaat 
1561 caagacagca ggaagctcca gaggtagaga tagaaatgac atgggtatga tgacattgag 
1621 aagcttattc cttcttttga aactttttca gattttcaaa ataagaacca aatattcata 
1681 aggttagtac ttcgtagatg atcatttaca tacaatttgc catggacgta aaatcagttt 
1741 tgctgtacca ataacaactc acctctcaaa tgtattggcc taagctttac caaaatatgg 
1801 accaacagga catactggtt ccagcaatag ctataacact cattactttg aaaataaatt 
1861 agtctagttt aaatttataa tcaacaaaac cttgatatat gcattcaacc tagtgacaac 
1921 tttcccttcc ctacacaagg tcttgatggt ctatacatga cttaactaac tactttcgtg 
1981 aaaaatagat ctaacaagct acttaggcag atacatattg agcaaaaaca acaacaacta 
2041 acatgccaag taaaacggta acttctacct cactttcagt taaagggcgg tcagcagaaa 
2101 gaacaaaaac aagcaaatct gcacgtggaa caaattcttc tgtaagacgc tgttgccttt 
2161 gaaggatcac attggtccca ggtgtgtcaa caatatttat ctgctcgaaa ggagaacaac 
2221 ccaaatcaat tcctccaaat aattagtaaa aattcaggac tgcgatggta gaattttgaa 
2281 ttactcacat ccttaagtat tggtgcagga agatagcata catattggcc atctggatgt 
2341 gtttggcaac gttgttgctc ttcggattcc aagtcagagt agcacagaaa cgtgatttca 
2401 ttggtagtgg ggactacccc ttctttcagg tatctcttcc caagaagtgc attgataacc 
2461 gttgattttc cagagttaaa ttccccctag taccagcaaa ttgtggattc caattaatta 
2521 tccatacaaa aactattttg aatcataaag gcaacttgtc ccacttaaaa catcttccaa 
2581 gaacggcata ctcacaagtc acaagaaagc gaaaaaacta tgtacacaag cagatggtac 
2641 agttaagaaa tatgcagttc ccttacaacc agatccataa gcaaattaag aagagaagca 
2701 tagacagaac aaaagaagaa gcgaaacatc atagtcatag gcgacgaaag aagatattga 
2761 gaagcactaa aggacaagtg actagaagca tgcccattgt atctcaagct atctagtgct 
2821 ggcaaaagtg tgtaactaaa atgtttctgg aatgaatata gagctaatac caatgcaaat 
2881 tcctggaatc acaatttgac ggagttgagt gcagaattac cactataacc atcagaaacg 
2941 gctcatcgat ccgagaaaca gcatcaatta gaagggagac ttcctccatc tgttaatgaa 
3001 aaatagtcga agaagtaaat gaaaccatat tataaccaca caaaaaccta cttagaacca 
3061 tttcaaaaaa ttgacttcaa ctaccaaatg tttgaaataa aaatcaccag tggagccgcc 
3121 ttgtggatga tttcaatcgt ctctctcaac acagatttct ccatttctac tattagtttc 
3181 tgtttgtcct ctaattttat gaagccagca gaattctttt tctccggcag ttcattcata 
3241 ttttgtgtct catgattatt tacaacataa gctccatcaa gactctggcg aagagctaca 
3301 tccctagaag aacgcaaatc tttcaacgaa ataacaaaac cagaaacacc tgatttcagt 
3361 aactgcaatt cttctttagc ttcttcattg cctctgcaag tcacatatat cggtattttc 
3421 acgctcttca acaaagaatc cgccacttgc gtatcttctt caccagatcc aagtataagg 
3481 aaatcagcac cctcggagct tgaggcaatt agagcagaat caacatcctt cacaatccga 
3541 gctaccagtg gaagaagtac cgagtcgggg ttggatccca tcaatgtgtt tctcgccaca 
3601 atcgccggaa gacctgtaca tagcattgag agacaaaact cactctatga gaaattaaat 
3661 gctaaaactg aatgaaatca gttgttacct tcgtcggaga gagcaacacc actagcacca 
3721 acggcggagg cgatatcaac acgttcagcg atcaagaggt aagcacggcc tttgacaagt 
3781 gatttcagca aacaagccgc ctcgtagagc ttaccagcgg tggctccgcc atcaatcacg 
3841 acgatttgaa ccgatttagc taaagcacgg tcgaccaaat caagagtctc ttcacgattc 
3901 ccgctcataa cctcgtcggc gtctagccgg agaagtaaac cgggaacggc gagttcggga 
3961 cgcttgtaac caccaggata aagagttcgc ggcctagaag aagaagtctg atcggcggat 
4021 tcatgcgaaa tgtttctgat cgagagagaa gaaaaacgcc tatgacgtgg aggagtaaag 
4081 gaggataact taaagcaccg gccaggaaac ggtggagatg cggcggagat aagaaacggt 
4141 gacgtcacac attgccggtg agagattaga gttctcatgg agagagaaag agagagagtg 
4201 agataccttc gtctgtgttt ctgattagtt ttcttatctc ttaaatatcc tattggccca 
4261 tcccaaagct ataagacgga cggcgcgatt aattactttt caaacacatg aattaacgtt 
4321 ttcacatatg tgttcatatc caaaaggtcc aaagtatacc acgaaaaggg agaaaaacag 
4381 atttaaattc gtgaaatccc tctcccacaa ttaaatttac ttcttccaaa caaagacaaa 
4441 cggcttgaac cagtcaagta agtgatacgg caccactaga tgttccagag cctccatctt 
4501 ttttaatacg aagaagattt gtcctttgtg tgtatgaatt taacaagttt taattataga 
4561 tttgtgtgtg tatgaattta aaaacctagt acgtagcatc agggaatgat atcatagcta 
4621 ttttagttga gctttcaaat aagagatgat caaaatttag aacttctaag aacatgaacg 
4681 aataaacaac tattttcttt tcaaaccaac taaggtagat ggtcactgaa agtatataca 
4741 tcagataaaa gttgcttgtt attccagatg aagttggacc gagaaaaaaa aaagttactt 
4801 gttattcaat atgtttggat ctttgtcttg cagattgcta tatagggttg ataatgggct 
4861 tcgttgtaat gggtatacag tgtataagaa tcggccttgt gcaaccaatc ctaatatgtg 
4921 tgtctcatta aggtaagtgc ttaagattag aagagtaaaa cacttgactt atcaactatg 
4981 tcaactaagg gttctatatt tttattaaat aaaaaataat tgaatatttt ttagaatgat 
5041 ttaataaatt taatgctatt gtttgattta aatgtataat tcaccgcgag aagaaatttt 
5101 ataactcaaa ttttaaagtt ttaagttgta tttgtttatt ttgttaaatg tttaatattg 
5161 tataattgta ttttgattgt tgtttctcgg atttcacccg tagtacatca tcccatatta 
5221 atatcgaatc aaacccgtca attctaaaat ttcacccgtg gtagtattta attgtataat 
5281 tatattttaa ttgtcattct aagatttcac tcctaattct atcgcaaatt attatcaacc 
5341 caaaccagtc aattctaaaa tatcacccgt agtacaccat cccatattaa tatcgaatca 
5401 agcccgtcaa ttctaggatt tcacccgtgg tagtatttaa ttgtataatt atattttaat 
5461 tgtcattcta ggatttcact cctaattcta tcgcaaatta ttatcaaccc aaaccagtca 
5521 attctaaaat atcacccgta gtacaccatc ccatattaat atcgattcaa actcgtcaat 
5581 tctaggattt cgctcgtggt agtatttaat tgtataatta tattttaatt gtcattttaa 
5641 ctcctagttc tatcgcaaat tcttatcaac ccaaacagtc aattctaaaa tttcacccgt 
5701 agtataaagt ttaaatattt ataatattta aatttcttat aaaagaatca aaatgtgttt 
5761 taaaaaaatt aaagttttaa gttttttttt tttaatattg ttaattttgt ttagtgttta 
5821 agattatata attacattat gattgtcatt atatgttttt ctccatagca tactatccca 
5881 tgttattatc cactcaaacc tgtcacacca tataaccccg tcccgtgaaa ttaaacacaa 
5941 atttgtcatt ttattataaa tttcaaatat ttataaaatt agaaacttca aaaaagatta 
6001 atattgaccc aaacttcatc attgaatttt gagtgttata tctaagattt ctctcgcaat 
SEQ ID NO:23 



1 atggaggctc tggaacatct agtgctttgg gatgggccaa taggatattt aagagataag 
61 aaaactaatc agaaacacag acgaaggtat ctcactctct ctctttctct ctccatgaga 
121 actctaatct ctcaccggca atgtgtgacg tcaccgtttc ttatctccgc cgcatctcca 
181 ccgtttcctg gccggtgctt taagttatcc tcctttactc ctccacgtca taggcgtttt 
241 tcttctctct cgatcagaaa catttcgcat gaatccgccg atcagacttc ttcttctagg 
301 ccgcgaactc tttatcctgg tggttacaag cgtcccgaac tcgccgttcc cggtttactt 
361 ctccggctag acgccgacga ggttatgagc gggaatcgtg aagagactct tgatttggtc 
421 gaccgtgctt tagctaaatc ggttcaaatc gtcgtgattg atggcggagc caccgctggt 
481 aagctctacg aggcggcttg tttgctgaaa tcacttgtca aaggccgtgc ttacctcttg 
541 atcgctgaac gtgttgatat cgcctccgcc gttggtgcta gtggtgttgc tctctccgac 
601 gaaggtcttc cggcgattgt ggcgagaaac acattgatgg gatccaaccc cgactcggta 
661 cttcttccac tggtagctcg gattgtgaag gatgttgatt ctgctctaat tgcctcaagc 
721 tccgagggtg ctgatttcct tatacttgga tctggtgaag aagatacgca agtggcggat 
781 tctttgttga agagcgtgaa aataccgata tatgtgactt gcagaggcaa tgaagaagct 
841 aaagaagaat tgcagttact gaaatcaggt gtttctggtt ttgttatttc gttgaaagat 
901 ttgcgttctt ctagggatgt agctcttcgc cagagtcttg atggagctta tgttgtaaat 
961 aatcatgaga cacaaaatat gaatgaactg ccggagaaaa agaattctgc tggcttcata 
1021 aaattagagg acaaacagaa actaatagta gaaatggaga aatctgtgtt gagagagacg 
1081 attgaaatca tccacaaggc ggctccactg atggaggaag tctcccttct aattgatgct 
1141 gtttctcgga tcgatgagcc gtttctgatg gttatagtgg gggaatttaa ctctggaaaa 
1201 tcaacggtta tcaatgcact tcttgggaag agatacctga aagaaggggt agtccccact 
1261 accaatgaaa tcacgtttct gtgctactct gacttggaat ccgaagagca acaacgttgc 
1321 caaacacatc cagatggcca atatataaat attgttgaca cacctgggac caatgtgatc 
1381 cttcaaaggc aacagcgtct tacagaagaa tttgttccac gtgcagattt gcttgttttt 
1441 gttctttctg ctgaccgccc tttaactgaa agtgaggtag aagttaccgt tttacttggc 
1501 atggaaggga aagttgtcac taggttgaat gcatatatca aggttgcgtt tctccggtac 
1561 acacagcagt ggaaaaagaa atttgtgttt attctgaata aatctgatat ctatcgtgat 
1621 gctcgtgagc ttgaggaagc tatttcattt gttaaagaga atacacggaa gttgcttaat 
1681 acagaaaatg tgatattgta tccggtgtcc gcacggtctg ctcttgaggc gaagctttca 
1741 acagcttctt tggttggcag agatgatctt gagatcgcag atcctggttc taattggaga 
1801 gtccagagct tcaatgaact tgagaaattt ctttatagct tcttggatag ctcaacagct 
1861 accgggatgg agagaataag gcttaaattg gagacaccca tggcgattgc tgagcgtctc 
1921 ctttcttctg tggaagctct tgtgagacaa gattgcctag ctgctaggga agacttggct 
1981 tcagcagaca agattatcag tcgaactaaa gaatacgcgc ttaagatgga atatgagagc 
2041 atttcttgga gaaggcaggc tctctcgttg attgataatg ccagattaca agttgttgat 
2101 ctgataggaa ctaccctgcg actatcaagc cttgatcttg cgatctcgta cgtgttcaaa 
2161 ggggaaaaat cggcctcagt agcagctaca tccaaagttc aaggtgaaat actcgctcca 
2221 gcactcacaa atgcgaaaga attgcttgga aaatatgctg aatggctaca atcaaatact 
2281 gcccgtgaag ggagtctgtc tctgaaatca ttcgaaaaca aatggccaac atatgtcaat 
2341 tcaaaaacgc aattgggcat agacacatat gacttacttc agaaaactga taaagtcagc 
2401 ttgaaaacaa tacagaactt gagtgccgga accacatcaa aacgattgga acaagatatt 
2461 cgagaagtg 
SEQ ID NO:24 



MEALEHLVLWDGPIGYLRDKKTNQKHRRRYLTLSLSLSMRTLISHRQCVTSPFLISAASPPFPGRCFKLSS 
FTPPRHRRFSSLSIRNISHESADQTSSSRPRTLYPGGYKRPELAVPGLLLRLDADEVMSGNREETLDLVDR 
ALAKSVQIVVIDGGATAGKLYEAACLLKSLVKGRAYLLIAERVDIASAVGASGVALSDEGLPAIVARNTLM 
GSNPDSVLLPLVARIVKDVDSALIASSSEGADFLILGSGEEDTQVADSLLKSVKIPIYVTCRGNEEAKEEL 
QLLKSGVSGFVISLKDLRSSRDVALRQSLDGAYVVNNHETQNMNELPEKKNSAGFIKLEDKQKLIVEMEKS 
VLRETIEIIHKAAPLMEEVSLLIDAVSRIDEPFLMVIVGEFNSGKSTVINALLGKRYLKEGVVPTTNEITF 
LCYSDLESEEQQRCQTHPDGQYINIVDTPGTNVILQRQQRLTEEFVPRADLLVFVLSADRPLTESEVEVTV 
LLGMEGKVVTRLNAYIKVAFLRYTQQWKKKFVFILNKSDIYRDARELEEAISFVKENTRKLLNTENVILYP 
VSARSALEAKLSTASLVGRDDLEIADPGSNWRVQSFNELEKFLYSFLDSSTATGMERIRLKLETPMAIAER 
LLSSVEALVRQDCLAAREDLASADKIISRTKEYALKMEYESISWRRQALSLIDNARLQVVDLIGTTLRLSS 
LDLAISYVFKGEKSASVAATSKVQGEILAPALTNAKELLGKYAEWLQSNTAREGSLSLKSFENKWPTYVNS 
KTQLGIDTYDLLQKTDKVSLKTIQNLSAGTTSKRLEQDIREV 
SEQ ID NO:25 



69061 acaaagacca gttaaaaacg tgtgtagtat aacttactgg taagtaaagc tataagcaag 
69121 aatctgtacc ttattttctc tctctctagt gagccctgac catccgaatt tcgcattcgc 
69181 caatcgctgt gtttccgtgt gttttccccc tttttggttt tagatttgcc taaaccaatc 
69241 agaacaagag aaacctggaa acaagaacca aaaaaagtgg gctttctctg catcatcatt 
69301 ccacttctgg tccccaactg aaaaggacaa tccaaagcta gatcccttca aattttcctt 
69361 tttgttttcg aaattttcgc aatttttaat attattttgg aagtctatgt ttctttctga 
69421 tctttagcaa caaaggaagg tggaatctgt ttcacgttta cacaaaaaca tgtcaactgg 
69481 agattttctc tttccctaac ttttgaccat acagtatggt ccatacttaa tattctctct 
69541 ttgtttttaa taaaataaaa ggtttggtta tcaagcatat atgtcattag cttaaagcta 
69601 tgactttgtt tagaaaactt aggaggacca tatggcaagc ttttatacag tgttagactt 
69661 ctaacgttaa ttctaaacaa tctccagtat caagcattaa caaggtttat tctagcacct 
69721 ctggattttt aaaacttctc gaaccaatcc ttaactaaaa aagaaattca agcgttttat 
69781 ctttagaaat cacagctagc atatgctgag aattactctc catggaaact tatactaaga 
69841 ttgttttttt ccctcatatt taagccacta aagtcaaaag attagtacat tgacaactaa 
69901 gtttagatgc tctatgcgga gaatcaattt catatgaatg tatcaagcaa ttcatgaact 
69961 ctaggagacc ataaaatcca attgacagaa aaaatgagtc aactaacata tttacctgtg 
70021 atatgaggta catgtgcagg tcaaagatca gaagaaaatt ttctccatga gtctcttgag 
70081 cttccaactc atccagcgat ttgtatcaca aacaatctga aaaagaagct aaaaaacgtt 
70141 ataccaaagt ttcacgccca taatgctatt gtttggttct ttcaagaacc tccccaatct 
70201 tttgaattcg cattcaaaaa aaccatcagt gagtccattt caagtcggaa ctggcaggta 
70261 ttattcatta tgacaaagta catacacttg ccccccactg aacaatgtca agaagggaaa 
70321 acccgacatt gtgttggaat agctaaagtc tcatctcgtc tcgtgataca tgaaggttat 
70381 caatatcaac ttgtagcaac tgtaatttac ttctaatatc tgataattct ttctggattc 
70441 ctaaaagacg atcaagtctt agctgagctt cttctcgata aggcttggca acaatattca 
70501 caaagttaac tagattactc gtcgcatctg aaagatcttt ttgcatagcg tcttcgagct 
70561 gttgagccaa cgcatcagcc actttattca ccttaccaat tatagcctgt cttcgatatg 
70621 ggaagtttgc tatagccaca tacctgtcac atagattatg ttatgcatac aaccagtctt 
70681 tcttaaaagt cataaatatg cctctagttg caagaaaaaa atacactagg cgtgatctaa 
70741 gaaggtggag taatgagaca ttgggaagag gggaaattta gagcagtgtt attaccctcc 
70801 agcggagcaa aggccaagag caagaagatc ttccagtgtg gtcggtagca ctgaggttag 
70861 aagtgatgca gacagtcctg cagctccaag cccaccaact gtcacaaaga actagaaaag 
70921 gcaagcaaaa ctcaactatg tcaaaagtgt cacttagatt gattcttgaa tagcgagacg 
70981 aagtatctgg gaaaatacgg tactgaatta acatctccgt cagatcatag gttcggattg 
71041 aacagatgac acaattaaac aatgatgaag atcaagacac tttaatcgac tgaattc 



Fig. 24 
AtFzo-like Genomic Sequence 

From 

F15K9, AC005278: 
F10O3, AC006550: 



69001 aaaaactttt caaaacttca tgtgttgtga aaacaaaagt tttttggtaa tgaaaactcg 
69061 acaaagacca gttaaaaacg tgtgtagtat aacttactgg taagtaaagc tataagcaag 
69121 aatctgtacc ttattttctc tctctctagt gagccctgac catccgaatt tcgcattcgc 

69181 caatcgctgt gtttccgtgt gttttccccc tttttggttt tagatttgcc taaaccaatc 
69241 agaacaagag aaacctggaa acaagaacca aaaaaagtgg gctttctctg catcatcatt 
69301 ccacttctgg tccccaactg aaaaggacaa tccaaagcta gatcccttca aattttcctt 
69361 tttgttttcg aaattttcgc aatttttaat attattttgg aagtctatgt ttctttctga 
69421 tctttagcaa caaaggaagg tggaatctgt ttcacgttta cacaaaaaca tgtcaactgg 
69481 agattttctc tttccctaac ttttgaccat acagtatggt ccatacttaa tattctctct 
69541 ttgtttttaa taaaataaaa ggtttggtta tcaagcatat atgtcattag cttaaagcta 
69601 tgactttgtt tagaaaactt aggaggacca tatggcaagc ttttatacag tgttagactt 
69661 ctaacgttaa ttctaaacaa tctccagtat caagcattaa caaggtttat tctagcacct 
69721 ctggattttt aaaacttctc gaaccaatcc ttaactaaaa aagaaattca agcgttttat 
69781 ctttagaaat cacagctagc atatgctgag aattactctc catggaaact tatactaaga 
69841 ttgttttttt ccctcatatt taagccacta aagtcaaaag attagtacat tgacaactaa 
69901 gtttagatgc tctatgcgga gaatcaattt catatgaatg tatcaagcaa ttcatgaact 

69961 ctaggagacc ataaaatcca attgacagaa aaaatgagtc aactaacata tttacctgtg 
70021 atatgaggta catgtgcagg tcaaagatca gaagaaaatt ttctccatga gtctcttgag 
70081 cttccaactc atccagcgat ttgtatcaca aacaatctga aaaagaagct aaaaaacgtt 
70141 ataccaaagt ttcacgccca taatgctatt gtttggttct ttcaagaacc tccccaatct 
70201 tttgaattcg cattcaaaaa aaccatcagt gagtccattt caagtcggaa ctggcaggta 
70261 ttattcatta tgacaaagta catacacttg ccccccactg aacaatgtca agaagggaaa 
70321 acccgacatt gtgttggaat agctaaagtc tcatctcgtc tcgtgataca tgaaggttat 

70381 caatatcaac ttgtagcaac tgtaatttac ttctaatatc tgataattct ttctggattc 

70441 ctaaaagacg atcaagtctt agctgagctt cttctcgata aggcttggca acaatattca 

70501 caaagttaac tagattactc gtcgcatctg aaagatcttt ttgcatagcg tcttcgagct 

70561 gttgagccaa cgcatcagcc actttattca ccttaccaat tatagcctgt cttcgatatg 

70621 ggaagtttgc tatagccaca tacctgtcac atagattatg ttatgcatac aaccagtctt 

70681 tcttaaaagt cataaatatg cctctagttg caagaaaaaa atacactagg cgtgatctaa 

70741 gaaggtggag taatgagaca ttgggaagag gggaaattta gagcagtgtt attaccctcc 

70801 agcggagcaa aggccaagag caagaagatc ttccagtgtg gtcggtagca ctgaggttag 

70861 aagtgatgca gacagtcctg cagctccaag cccaccaact gtcacaaaga actagaaaag 

70921 gcaagcaaaa ctcaactatg tcaaaagtgt cacttagatt gattcttgaa tagcgagacg 

70981 aagtatctgg gaaaatacgg tactgaatta acatctccgt cagatcatag gttcggattg 

71041 aacagatgac acaattaaac aatgatgaag atcaagacac tttaatcgac tgaattc 

tagttagaac ttagactaaa agtatttaat acttgaagct 

241 caccacttct cgaatatctt gttccaatcg ttttgatgtg gttccggcac tcaagttctg 

301 tattgttttc aagctgactt tatcagtttt ctgaagtaag tcatatgtgt ctatgcccaa 

361 ttgcgttttt gaattgacat atgttggcca tttgttttcg aatgatttca gagacagact 

421 cccttcacgg gcagtatttg attgtagcca ttcagcatat tttccaagca attcctgcaa 

481 acagtgaaat gtaaagtcaa tcaggtcaca acaagacatt gttagacaat atttactttc 

541 tgcatgaata gtgactatat ctcagacctc atatatatga ccaacatgtc cccagttagg 

601 ccaatactca aagaataaag catcacactt actttcgcat ttgtgagtgc tggagcgagt 

661 atttcacctt gaactttgga tgtagctgct actgaggccg atttttcccc tttgaacacg 

721 tacgagatcg caagatcaag gcttgatagt cgcagggtag ttcctatcag atcaacaact 

781 tgtaatctgg cattatcaat ctgcgataaa agccagagta caaaacacaa aaaagccaag 

841 ttagaacaat ccaatttcct ccttcgtgat tcaacaagat aatatctaat agaatttata 
901 ccaacgagag agcctgcctt ctccaagaaa tgctctcata ttccatctta agcgcgtatt 
961 ctttagttcg actgataatc ttgtctgctg aagccaagtc ttccctagca gctaggcaat 
1021 cttgtctcac aagagcttcc acagaagaaa ggagacgctc agcaatcgcc atgggtgtct 
1081 ccaatttaag ccttattctc tccatcccgg tagctgttga gctatccaag aagctataaa 
1141 gaaatttctc aagttcattg aagctctgga ctctccaatt agaaccagga tctgcgatct 
1201 caagatcatc tctgccaacc aaagaagctg ttgaaagctt cgcctcaaga gcagaccgtg 
1261 cggacaccgg atacaatatc acattttctg tattaagcaa cttccgtgta ttctctttaa 
1321 caaatgaaat agcttcctca agctgcagca gtaaaattaa tctttagtca agaaaaatcc 
1381 agcaattaca gagactatca aggaaaagac ataaatattg tttctgataa acctcacgag 
1441 catcacgata gatatcagat ttattcagaa taaacacaaa tttctttttc cactgctgtg 
1501 tgtaccggag aaacgcaacc tgacaaacgc aatgagatgt aaatcagcct acagtaaaat 
1561 caagacagca ggaagctcca gaggtagaga tagaaatgac atgggtatga tgacattgag 
1621 aagcttattc cttcttttga aactttttca gattttcaaa ataagaacca aatattcata 

1681 aggttagtac ttcgtagatg atcatttaca tacaatttgc catggacgta aaatcagttt 
1741 tgctgtacca ataacaactc acctctcaaa tgtattggcc taagctttac caaaatatgg 
1801 accaacagga catactggtt ccagcaatag ctataacact cattactttg aaaataaatt 

1861 agtctagttt aaatttataa tcaacaaaac cttgatatat gcattcaacc tagtgacaac 
1921 tttcccttcc ctacacaagg tcttgatggt ctatacatga cttaactaac tactttcgtg 
1981 aaaaatagat ctaacaagct acttaggcag atacatattg agcaaaaaca acaacaacta 
2041 acatgccaag taaaacggta acttctacct cactttcagt taaagggcgg tcagcagaaa 
2101 gaacaaaaac aagcaaatct gcacgtggaa caaattcttc tgtaagacgc tgttgccttt 
2161 gaaggatcac attggtccca ggtgtgtcaa caatatttat ctgctcgaaa ggagaacaac 
2221 ccaaatcaat tcctccaaat aattagtaaa aattcaggac tgcgatggta gaattttgaa 
2281 ttactcacat ccttaagtat tggtgcagga agatagcata catattggcc atctggatgt 
2341 gtttggcaac gttgttgctc ttcggattcc aagtcagagt agcacagaaa cgtgatttca 
2401 ttggtagtgg ggactacccc ttctttcagg tatctcttcc caagaagtgc attgataacc 
2461 gttgattttc cagagttaaa ttccccctag taccagcaaa ttgtggattc caattaatta 
2521 tccatacaaa aactattttg aatcataaag gcaacttgtc ccacttaaaa catcttccaa 
2581 gaacggcata ctcacaagtc acaagaaagc gaaaaaacta tgtacacaag cagatggtac 
2641 agttaagaaa tatgcagttc ccttacaacc agatccataa gcaaattaag aagagaagca 
2701 tagacagaac aaaagaagaa gcgaaacatc atagtcatag gcgacgaaag aagatattga 
2761 gaagcactaa aggacaagtg actagaagca tgcccattgt atctcaagct atctagtgct 
2821 ggcaaaagtg tgtaactaaa atgtttctgg aatgaatata gagctaatac caatgcaaat 

2881 tcctggaatc acaatttgac ggagttgagt gcagaattac cactataacc atcagaaacg 
2941 gctcatcgat ccgagaaaca gcatcaatta gaagggagac ttcctccatc tgttaatgaa 

3001 aaatagtcga agaagtaaat gaaaccatat tataaccaca caaaaaccta cttagaacca 
3061 tttcaaaaaa ttgacttcaa ctaccaaatg tttgaaataa aaatcaccag tggagccgcc 
3121 ttgtggatga tttcaatcgt ctctctcaac acagatttct ccatttctac tattagtttc 
3181 tgtttgtcct ctaattttat gaagccagca gaattctttt tctccggcag ttcattcata 
3241 ttttgtgtct catgattatt tacaacataa gctccatcaa gactctggcg aagagctaca 
3301 tccctagaag aacgcaaatc tttcaacgaa ataacaaaac cagaaacacc tgatttcagt 
3361 aactgcaatt cttctttagc ttcttcattg cctctgcaag tcacatatat cggtattttc 
3421 acgctcttca acaaagaatc cgccacttgc gtatcttctt caccagatcc aagtataagg 
3481 aaatcagcac cctcggagct tgaggcaatt agagcagaat caacatcctt cacaatccga 
3541 gctaccagtg gaagaagtac cgagtcgggg ttggatccca tcaatgtgtt tctcgccaca 
3601 atcgccggaa gacctgtaca tagcattgag agacaaaact cactctatga gaaattaaat 
3661 gctaaaactg aatgaaatca gttgttacct tcgtcggaga gagcaacacc actagcacca 
3721 acggcggagg cgatatcaac acgttcagcg atcaagaggt aagcacggcc tttgacaagt 
3781 gatttcagca aacaagccgc ctcgtagagc ttaccagcgg tggctccgcc atcaatcacg 
3841 acgatttgaa ccgatttagc taaagcacgg tcgaccaaat caagagtctc ttcacgattc 
3901 ccgctcataa cctcgtcggc gtctagccgg agaagtaaac cgggaacggc gagttcggga 

3961 cgcttgtaac caccaggata aagagttcgc ggcctagaag aagaagtctg atcggcggat 

4021 tcatgcgaaa tgtttctgat cgagagagaa gaaaaacgcc tatgacgtgg aggagtaaag 
4081 gaggataact taaagcaccg gccaggaaac ggtggagatg cggcggagat aagaaacggt 
4141 gacgtcacac attgccggtg agagattaga gttctcatgg agagagaaag agagagagtg 
4201 agataccttc gtctgtgttt ctgattagtt ttcttatctc ttaaatatcc tattggccca 
4261 tcccaaagct ataagacgga cggcgcgatt aattactttt caaacacatg aattaacgtt 
4321 ttcacatatg tgttcatatc caaaaggtcc aaagtatacc acgaaaaggg agaaaaacag 
4381 atttaaattc gtgaaatccc tctcccacaa ttaaatttac ttcttccaaa caaagacaaa 
4441 cggcttgaac cagtcaagta agtgatacgg caccactaga tgttccagag cctccatctt 
4501 ttttaatacg aagaagattt gtcctttgtg tgtatgaatt taacaagttt taattataga 
4561 tttgtgtgtg tatgaattta aaaacctagt acgtagcatc agggaatgat atcatagcta 
4621 ttttagttga gctttcaaat aagagatgat caaaatttag aacttctaag aacatgaacg 
4681 aataaacaac tattttcttt tcaaaccaac taaggtagat ggtcactgaa agtatataca 
4741 tcagataaaa gttgcttgtt attccagatg aagttggacc gagaaaaaaa aaagttactt 
4801 gttattcaat atgtttggat ctttgtcttg cagattgcta tatagggttg ataatgggct 
4861 tcgttgtaat gggtatacag tgtataagaa tcggccttgt gcaaccaatc ctaatatgtg 
4921 tgtctcatta aggtaagtgc ttaagattag aagagtaaaa cacttgactt atcaactatg 
4981 tcaactaagg gttctatatt tttattaaat aaaaaataat tgaatatttt ttagaatgat 
5041 ttaataaatt taatgctatt gtttgattta aatgtataat tcaccgcgag aagaaatttt 
5101 ataactcaaa ttttaaagtt ttaagttgta tttgtttatt ttgttaaatg tttaatattg 
5161 tataattgta ttttgattgt tgtttctcgg atttcacccg tagtacatca tcccatatta 
5221 atatcgaatc aaacccgtca attctaaaat ttcacccgtg gtagtattta attgtataat 
5281 tatattttaa ttgtcattct aagatttcac tcctaattct atcgcaaatt attatcaacc 
5341 caaaccagtc aattctaaaa tatcacccgt agtacaccat cccatattaa tatcgaatca 
5401 agcccgtcaa ttctaggatt tcacccgtgg tagtatttaa ttgtataatt atattttaat 
5461 tgtcattcta ggatttcact cctaattcta tcgcaaatta ttatcaaccc aaaccagtca 
5521 attctaaaat atcacccgta gtacaccatc ccatattaat atcgattcaa actcgtcaat 
5581 tctaggattt cgctcgtggt agtatttaat tgtataatta tattttaatt gtcattttaa 
5641 ctcctagttc tatcgcaaat tcttatcaac ccaaacagtc aattctaaaa tttcacccgt 
5701 agtataaagt ttaaatattt ataatattta aatttcttat aaaagaatca aaatgtgttt 
5761 taaaaaaatt aaagttttaa gttttttttt tttaatattg ttaattttgt ttagtgttta 
5821 agattatata attacattat gattgtcatt atatgttttt ctccatagca tactatccca 
5881 tgttattatC CACTCAAACC TGTCACACCA TATAACcccg tcccgtgaaa ttaaacacaa 
5941 atttgtcatt ttattataaa tttcaaatat ttataaaatt agaaacttca aaaaagatta 
6001 atattgaccc aaacttcatc attgaatttt gagtgttata tctaagattt ctctcgcaat 
6061 atatcgtccc gtattaatat cttttatatt gtttaaattt cttgtaaaat ttaatttata 
6121 attttttaaa ctttttaaag tttcaatttt ttaaaataaa taaccctagg aaacaaacca 
6181 ttttaattta aagataaact ttataaaaag tttttaaaat tataatattt aacttttgat 
6241 aaagttataa tatttataat ttcttgaaac attttaaagt ttcaattctt taaaataata 
6301 aatccgagta aaatcagata actattttaa ttttggacgc ttgataaatc aagcttcctg 
6361 ctcattcgta atcagaatca ttttggtcct tttataatat gggtctgaac cattgtccaa 
6421 tttttctaag cgatgtggga cattgtacac atattatttc ttcataggtt gaataatata 
6481 tgtccgttta aaaaactttg aattacatca tattcagaaa aaaatataat attttattaa 
6541 ctatatatat tttatataaa ttcaaaataa ataaagtata agatcaaata aaaatgaaag 
ARC5 Homologous Sequences



<A class=dblinks 

href="javascript:PopUpMenu2_Set(Menu22246438,",",",",");" 

target=_self>Links</A></SPAN></TD></TR></TBODY></TABLE></DT></DL><PRE>LOCUS BQ860973 712 bp mRNA linear EST 14-AUG-2002

DEFINITION QGC17C24.yg.ab1 QG_ABCDI lettuce salinas Lactuca sativa cDNA clone

QGC17C24, mRNA sequence.
ACCESSION BQ860973
VERSION BQ860973.1 GI:22246438
KEYWORDS EST. 
SOURCE Lactuca sativa 

ORGANISM <A href="http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?name=Lactuca+sativa">Lactuca sativa</A>

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;

Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; 

asterids; campanulids; Asterales; Asteraceae; Cichorioideae; 

Cichorieae; Lactuca. 
REFERENCE 1 (bases 1 to 712) 
AUTHORS Kozik,A., Michelmore,R.W., Knapp,S., Matvienko,M., Rieseberg,L., 

Lin,H., van Damme,M., Lavelle,D., Chevalier,P., Ziegle,J., Ellison

,P., Kolkman,J., Slabaugh,M.S., Livingston,K., Zhou,Y., Lai,Z.,

Church,S., Jackson,L. and Bradford,K. 
TITLE Lettuce and Sunflower ESTs from the Compositae Genome Project 

<A 

href="http://compgenomics.ucdavis.edu/">http://compgenomics.ucdavis.edu/</A>/ 

JOURNAL Unpublished 
COMMENT Contact: Alexander Kozik [R.W.Michelmore] 

Department of Vegetable Crops, R.W.Michelmore Lab 

University of California at Davis (UCD) 

Asmundson Hall, UCD, Davis, CA 95616, USA 

Tel: 1-(530)-742-1742

Fax: 1-(530)-752-9659

Email: <A href="mailto:akozik@atgc.org">akozik@atgc.org</A> <A href="mailto:[michelmore@vegmail.ucdavis.edu">[michelmore@vegmail.ucdavis.edu</A>]

singleton, see <A href="http://cgpdb.ucdavis.edu/">http://cgpdb.ucdavis.edu/</A>/ 

for 

details. 

Plate:QGC17 row: C column: 24. 
FEATURES Location/Qualifiers
source 1..712

/organism="Lactuca sativa"
/mol_type="mRNA"
/cultivar="Salinas"
/db_xref="<A href="
http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?id=42
/clone="QGC17C24"
/lab_host="E.coli"

/clone_lib="QG_ABCDI lettuce salinas"

/note="Vector: pBRcDNASfiAB; The library was constructed

from 10 different sources of RNA from a single genotype. 

Separate cDNAs were generated using primers that 

incorporated unique 5' and 3' tags to distinguish each 

source of RNA. cDNAs were then pooled, size-fractionated,

directionally cloned into a custom medium-copy vector and 

transformations made with four size classes to minimize 

size bias. Details of each source of RNA and library 

construction can be obtained at <A href="http://cgpdb.ucdavis.edu/">http://cgpdb.ucdavis.edu/</A>/

TAG_LIB=QG_ABCDI lettuce salinas
TAG_TISSUE=chemical induction
TAG_SEQ=TGTAGCCGGG"

BASE COUNT 206 a 152 c 142 g 210 t 2 others

ORIGIN 

1 ttgttcagct ccgccaaaag aatccaagaa ttggcgtaat ccggctcgat tcttattgtg 
61 aagggaccag gtgacataac gggtggtgct tattagatct tccatgcatt tttcatggca 
121 tgatctttcg gtggattcag caaagttata gaaagcagat gaaacacgtc tcaagaaaac 
181 ttcatggcca cttaggaatt cgccttcttt ctgaagaaga taaacggaga tgggaagtaa
241 tctcttgaga atgtgaagaa gtcgactgcc caactgatga agaaaaggtt caaaagtatc 
301 acgagctttt gcaacagcga tgacacatgc agtcctggag taatttgttc catcatgaat 
361 atcttcgacc ccacatgcat tcacaatttc ttcacgtgta attgcagggc attttatccc 
421 tccaacaaca aacctaaatt cagccatggc acgatgatat tgtgcacctc catatagacg 
481 catacctgca ttaggtatta gtttgtgtgg gaactgagag ccatcaatac cgattaatgc 
541 ccctccatta accctctcat cttgtagtgt ttccccaaat ttatctggag gtgcaacaac 
601 tgtccctntt catagcagtg ataacttggt aaggaaaaga tcatgaaaag atctcncttt 
661 ctcctttagt ttgacttcat ctaaagtgct gagttcttga tttatgtcat tt 

// 

</PRE> 
<DL> 
<DT> 

<TABLE cellSpacing=0 cellPadding=0 width="100%">
<TBODY> 
<TR> 

<TD><INPUT type=checkbox value=13371119 name=uid><B>2: </B>BG452325.

NF086D06LF1F1047 ...[gi:13371119] </TD>
<TD align=right><SPAN> 
<SCRIPT language=JavaScript1.2>

<!--

var Menu13371119 = [

["Taxonomy","window.top.location='/entrez/query.fcgi?db=nucleotide&cmd=Display&dopt=nucleotide_taxonomy&from_uid=13371119'","",""],

["Help","window.open('/entrez/query/static/popup.html','Links_Help','resizable=no,scrollbars=yes,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no,width=400,height=500');","",""]

]

//-->

</SCRIPT>

<A class=dblinks

href="javascript:PopUpMenu2_Set(Menu13371119,",",",",");"

target=_self>Links</A></SPAN></TD></TR></TBODY></TABLE></DT></DL><PRE>LOCUS BG452325 666 bp mRNA linear EST 16-MAR-2001

DEFINITION NF086D06LF1F1047 Developing leaf Medicago truncatula cDNA clone

NF086D06LF 5', mRNA sequence.

ACCESSION BG452325 

VERSION BG452325.1 GI:13371119 

KEYWORDS EST. 

SOURCE Medicago truncatula (barrel medic) 
ORGANISM <A href="http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?name=Medicago+truncatula">Medicago truncatula</A>

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 

Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids 

; eurosids I; Fabales; Fabaceae; Papilionoideae; Trifolieae; 

Medicago. 
REFERENCE 1 (bases 1 to 666) 
AUTHORS Torres-Jerez,I., Scott,A.D., Harris,A.R., Gonzales,R.A., Bell,C.J.,

Flores,H.R., Inman,J.T., Weller,J.W. and May,G.D.
TITLE Expressed Sequence Tags from the Samuel Roberts Noble Foundation 

Medicago truncatula leaf library 
JOURNAL Unpublished 
COMMENT Contact: May GD 

Plant Biology Division 

The Samuel Roberts Noble Foundation 

2510 Sam Noble Parkway, Ardmore, OK 73402, USA 

Tel: 580 221 7391 

Fax: 580 221 7380 

Email: <A href="mailto:gdmay@noble.org">gdmay@noble.org</A> 
Insert Length: 666 Std Error: 0.00
Plate: 086 row: D column: 06
Seq primer: TCACACAGGAAACAGCTATGAC.
FEATURES Location/Qualifiers
source 1..666

/organism="Medicago truncatula"
/mol_type="mRNA"
/db_xref="<A href="http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?id=3880">taxon:3880</A>"
/clone="NF086D06LF"
/tissue_type="leaf"
/dev_stage="Pooled developmental"
/clone_lib="Developing leaf"

/note="Vector: Lambda Zap; Contains a mixture of very

young, developing, mature and senescing leaves."
BASE COUNT 201 a 163 c 147 g 154 t 1 others
ORIGIN 

1 atctaaagta acaaccacca caaaacacaa caatggagga agaaagagaa caccaccaac 
61 tcaaagacaa agaagaaaac gagtggcgtc tctacgaagc ttacaatgaa cttcacgcgc 
121 ttgctcaaga acttcacacg cctttcgacg cgccggcggt actggttgtg ggccaccaaa 
181 cagacgggaa gagcgcctta gttgaggctc taatgggctt ccagttcaac cacgtcggtg 
241 gtggcaccaa aacccgccgg cccattactc ttcacatgaa atatggccca cattgcgagt 
301 ctccttcttg ctatcttctt tctgatgatg acccttctct ttctcaccat atgtcacttt 
361 cccaaatcca gggttatatt gaagctgaga atgcgaggtt ggagcgtgac tcatgttgtc 
421 aattttcagc taaggaaata atcataaaag tggaatacaa atactgtccc aatctcacca 
481 taatagacac accaggatta gttgctcctg caccaggtcg taaaaatagg gcgatacagg 
541 cacaggcacg agcggtagag tcactcgttc gtgcaaaaat gcagcacaag gagttcatta 
601 tactctgtct tgaagattgt agtgattgga gcaatgcgac tacgangcgc gttgtaatgc 
661 aaattg 

// 

</PRE> 
<DL> 
<DT> 

<TABLE cellSpacing=0 cellPadding=0 width="100%">
<TBODY>
<TR>

<TD><INPUT type=checkbox value=14878353 name=uid><B>3: </B>BI270606.

NF056G04FL1F1036 ...[gi:14878353] </TD> 
<TD align=right><SPAN> 

<SCRIPT language=JavaScript1.2>

<!--

var Menu14878353 = [

["Taxonomy","window.top.location='/entrez/query.fcgi?db=nucleotide&cmd=Display&dopt=nucleotide_taxonomy&from_uid=14878353'","",""],

["Help","window.open('/entrez/query/static/popup.html','Links_Help','resizable=no,scrollbars=yes,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no,width=400,height=500');","",""]

]

//-->

</SCRIPT>

<A class=dblinks 

href="javascript:PopUpMenu2_Set(Menul4878353,",",",",");" 

target=_self>Links</A></SPAN></TD></TR></TBODY></TABLE></DT></DL><PRE>LOCUS BI270606 663 bp mRNA linear EST 18-JUL-2001

DEFINITION NF056G04FL1F1036 Developing flower Medicago truncatula cDNA clone

NF056G04FL 5', mRNA sequence.
ACCESSION BI270606
VERSION BI270606.1 GI:14878353
KEYWORDS EST. 

SOURCE Medicago truncatula (barrel medic) 
ORGANISM <A href="http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?name=Medicago+truncatula">Medicago truncatula</A>

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 

Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids 

; eurosids I; Fabales; Fabaceae; Papilionoideae; Trifolieae; 

Medicago. 
REFERENCE 1 (bases 1 to 663) 
AUTHORS Torres-Jerez,I., Scott,A.D., Harris,A.R., Gonzales,R.A., Bell,C.J.,

Flores,H.R., Inman,J.T., Weller,J.W. and May,G.D.
TITLE Expressed Sequence Tags from the Samuel Roberts Noble Foundation 

Medicago truncatula flower library 
JOURNAL Unpublished 
COMMENT Contact: May GD 

Plant Biology Division 

The Samuel Roberts Noble Foundation 

2510 Sam Noble Parkway, Ardmore, OK 73402, USA 

Tel: 580 221 7391 

Fax: 580 221 7380 

Email: <A href="mailto:gdmay@noble.org">gdmay@noble.org</A> 
Insert Length: 663 Std Error: 0.00 
Plate: 056 row: G column: 04 
Seq primer: TCACACAGGAAACAGCTATGAC.
FEATURES Location/Qualifiers
source 1..663

/organism="Medicago truncatula"
/mol_type="mRNA"
/db_xref="<A href="
http://www.ncbi.nlm.nih.gov/htbin-post/
/clone="NF056G04FL"
/tissue_type="Developing flowers"

/dev_stage="Developmentally pooled. Contains a mixture of
very young, developing, fully-opened flowers and flowers
in early transition into pods."
/clone_lib="Developing flower"

/note="Vector: Lambda Zap; cDNA was prepared from polyA+
enriched, pooled samples of equivalent amounts of total
RNA from very young, developing, fully-opened flowers and
flowers transitioning into pods. The cDNA was
directionally ligated into the Uni-Zap XR vector
(Stratagene) and packaged using the Gigapack III Gold
packaging extracts. Phagemids containing cDNA inserts were
in vivo excised from the recombinant Uni-ZAP XR vector
using ExAssist helper phage and the E. coli strain
XL1-Blue MRF' (Stratagene). Excised plasmids were plated
using SOLR cells."

BASE COUNT 191 a 141 c 144 g 187 t

ORIGIN 

1 gtctttatgg gggtgcacaa tatcatcgag caatggctga atttcgtttt gtagttggag 
61 gaatcaagtg ccctccaatt acccgggaag aaattgtaaa tgcttgtgga gttgaagaca 
121 ttcatgatgg aacaaactac tctaggactg cttgtgtaat tgctgttgca aaggctcatg 
181 atacatttga accttttctt catcagttgg ggtctagatt gttgcacata cttaagagat 
241 tgctcccaat ctctttttat cttcttcaga aagattgtga gtatctaagt ggccatcagg 
301 tgttcctcag gcgtgttgcc tccgccttcg acaactttgc agaatccact gaaaaatcat 
361 gccgtgaaaa atgtatggag gacttggtaa gcaccacacg atatgtctca tggtctctac 
421 acaataagag tcgggcagga ttacgccagt tcttagattc atttggtgga acagaacatt 
481 ccaatgtttg taatgatccc actgcaactg ttctatcaca aacaaatgtg caagagaagg 
541 aagacacaaa gccacaacta gaagtaaagc tcagtcacgt ggcctctgga actgatccta 
601 gcacatccac ccagacagct gaaacaaagc ttgctgacct tcttgatagt acactttgga 
661 atc

// 

</PRE> 
<DL> 
<DT> 

<TABLE cellSpacing=0 cellPadding=0 width="100%">
<TBODY>
<TR>

<TD><INPUT type=checkbox value=22485477 name=uid><B>4: </B>BU045400.

PP_LEa0022H05f Pe...[gi:22485477] </TD>
<TD align=right><SPAN>

<SCRIPT language=JavaScript1.2>

<!--

var Menu22485477 = [

["Taxonomy","window.top.location='/entrez/query.fcgi?db=nucleotide&cmd=Display&dopt=nucleotide_taxonomy&from_uid=22485477'","",""],

["Help","window.open('/entrez/query/static/popup.html','Links_Help','resizable=no,scrollbars=yes,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no,width=400,height=500');","",""]

]

//-->

</SCRIPT>

<A class=dblinks 

href="javascript:PopUpMenu2_Set(Menu22485477,",",",",");"

target=_self>Links</A></SPAN></TD></TR></TBODY></TABLE></DT></DL><PRE>LOCUS BU045400 622 bp mRNA linear EST 26-AUG-2002

DEFINITION PP_LEa0022H05f Peach developing fruit mesocarp Prunus persica cDNA 

clone PP_LEa0022H05f, mRNA sequence. 
ACCESSION BU045400 
VERSION BU045400.1 GI:22485477 
KEYWORDS EST. 
SOURCE Prunus persica (peach) 
ORGANISM <A href="http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?name=Prunus+persica">Prunus persica</A>

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids 
; eurosids I; Rosales; Rosaceae; Amygdaloideae; Prunus. 
REFERENCE 1 (bases 1 to 622) 
AUTHORS Callahan,A., Palmer,M., Main,D., Wing,R. and Abbott,A. 
TITLE Peach Model Genome for Rosaceae 
JOURNAL Unpublished 
COMMENT Contact: Abbott, A. 

Dept of Genetics and Biochemistry 
Clemson University 

122 Long Hall, Clemson University, Clemson, SC 29634, USA 
Tel: 864 656 3060 
Fax: 864 656 6879 

Email: <A href="mailto:aalbert@clemson.edu">aalbert@clemson.edu</A>
Total High Quality bases = 553 
Seq primer: TAATACGACTCACTATAGGG 
High quality sequence stop: 622. 
FEATURES Location/Qualifiers 
source 1..622 

/organism="Prunus persica"
/mol_type="mRNA"
/cultivar="Loring"
/db_xref="<A href="
http://www.ncbi.nlm.nih.

/clone="PP_LEa0022H05f"
/tissue_type="Mesocarp"
/lab_host="E. coli"

/clone_lib="Peach developing fruit mesocarp"
/note="Vector: pBluescript II SK(-); Site_1: EcoRI;
Site_2: XhoI; authority=Prunus persica L. Batsh; The
sequence has been trimmed to remove vector sequence and
contains a minimum of 100 bases of phred value 20 or
above. For more details on library preparation and
sequence analysis go to

<A href="http://www.genome.clemson.edu/projects/peach">
http://www.genome.clemson.edu/projects/peach</A>. To order

this clone go to <A href="http://www.genome.clemson.edu/orders">
http://www.genome.clemson.edu/orders</A>" 
BASE COUNT 168 a 125 c 147 g 181 t 1 others 
ORIGIN 

1 gcttatacct aacgcaggaa tgcgtttata tggtggtgca caataccacc gtgccatggc 
61 tgagttccgc tttgtagttg gaggaataaa atgccctcca attacaaggg aagaaattgt 
121 aaatgcatgt ggagttgaag atttacatga tggcacaaac tactcaagga cagcttgtgt 
181 aatagccgtt gcaaaggccc gtgatacatt tgagcctttc cttcatcagt taggttgtag 
241 actcttgcac attctaaaga gattacttcc tatatcagtc tatcttcttc agaaagatgg 
301 tgagtattta agtggccatg aggtgtttct taggcgtgtt gcttctgctt tcaatgactt 
361 tgcagaatct accgaaaggg catgtcgtga aaaatgcatg gaggatttag taagcaccac 
421 ccgctatgtc acctggtccc ttcacaacaa gaatcgagct gggttacgtc aatttttaga 
481 ctcgttcgct ggaacagaac ataacactat gggtagtaat tgcgtacctg ctggtatttc 
541 ccaagattca tcctttgggt ctgttgccaa tgagaaggat actaagtcaa gggcagatgt 
601 gaagctcanc catgtggcgt ct 

//

</PRE> 

<TABLE cellSpacing=0 cellPadding=0 width="100%" bgColor=#cccccc>
<TBODY>
<TR>
<TD>

<TABLE cellSpacing=0 cellPadding=0>
<TBODY>
<TR>

<TD noWrap><INPUT onclick="GoV('')" type=button value=Display name="">
<SMALL><SELECT onchange=form.view.selectedIndex=selectedIndex
name=view1><OPTION value=DocSum>Summary</OPTION> <OPTION
value=asn>ASN.1</OPTION> <OPTION value=est>EST</OPTION> <OPTION
value=fasta>FASTA</OPTION> <OPTION value=fasta_xml>TinySeq
XML</OPTION> <OPTION value=gb selected>GenBank</OPTION> <OPTION
value=gb_xml>GBSeq XML</OPTION> <OPTION value=gi>GI List</OPTION>




Fig. 26 continued 9/9 



<OPTION value=graph>Graphics</OPTION> <OPTION
value=xml>XML</OPTION> <OPTION

value=def>default</OPTION></SELECT></SMALL><SMALL>

Show: </SMALL><SELECT

onchange=form.dispmax.selectedIndex=selectedIndex
name=dispmax1><OPTION value=1>1</OPTION> <OPTION
value=2>2</OPTION> <OPTION value=5>5</OPTION> <OPTION
value=10>10</OPTION> <OPTION value=20 selected>20</OPTION> <OPTION
value=50>50</OPTION> <OPTION value=100>100</OPTION> <OPTION
value=200>200</OPTION> <OPTION value=500>500</OPTION></SELECT>

<INPUT onclick="GoV
(form.SendTo.options[form.SendTo.selectedIndex].value,4)" type=button value="Send to"
name="">

<SELECT onchange=form.SendTo.selectedIndex=selectedIndex
name=SendTo1><OPTION value=on selected>File</OPTION> <OPTION
value=t>Text</OPTION> <OPTION

value="Add to Clipboard">Clipboard</OPTION></SELECT>
</TD></TR></TBODY></TABLE></TD></TR>
<TR> 
<TD> 

<TABLE cellSpacing=0 cellPadding=0 width="100%"> 
<TBODY> 
<TR> 

<TD align=middle width="50%"> 
<DIV class=medium2>Items 1-4 of 4</DIV></TD>
<TD align=right width="100%">One page.</TD></TR><INPUT type=hidden
value=20 name=showndispmax><INPUT type=hidden value=0
name=page></TBODY></TABLE></TD></TR></TBODY></TABLE></FORM><BR>
<DIV class=medium1 align=center>

<P><A href="http://www.ncbi.nlm.nih.gov/About/disclaimer.html">Disclaimer</A> |
<A href="mailto:info@ncbi.nlm.nih.gov">Write to the Help Desk</A><BR><A
href="http://www.ncbi.nlm.nih.gov/">NCBI</A> | <A

href="http://www.nlm.nih.gov/">NLM</A> | <A href="http://www.nih.gov/">NIH</A>
</P>

<P>&nbsp;</P></DIV>

<P class=dblinks align=right><FONT color=#eeeeee size=-5>Jun 19 2003 12:37:45 <!--
ipubmed7

--></FONT></P>

<SCRIPT language=JAVASCRIPT> /* <!-- */ TextFocus (); // --></SCRIPT>
</BODY></HTML>
Fzo-like Homologous Sequences



1: BG890612. EST516463 cSTD So...[gi:14267734]

LOCUS BG890612 752 bp mRNA linear EST 07-MAR-2003

DEFINITION EST516463 cSTD Solanum tuberosum cDNA clone cSTD19A23 5' sequence,

mRNA sequence.
ACCESSION BG890612
VERSION BG890612.1 GI:14267734
KEYWORDS EST. 
SOURCE Solanum tuberosum (potato) 
ORGANISM Solanum tuberosum 

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 

Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; 

asterids; lamiids; Solanales; Solanaceae; Solanum. 
REFERENCE 1 (bases 1 to 752) 
AUTHORS van der Hoeven,R., Bezzerides,J., Ewing,E., Cho,J., Chiemingo,A.,

Bougri,O., Buell,C.R., Ronning,C., Tanksley,S. and Baker,B.
TITLE Generations of ESTs from dormant potato tubers 
JOURNAL Unpublished 
COMMENT Contact: Robin Buell 

The Institute for Genomic Research 

9712 Medical Center Dr, Rockville, MD 20850, USA 

Email: potato-array@tigr.org 

This clone can be obtained from the University of Arizona Genomics 
Institute. Orders can be made through URL: 
http://genome.arizona.edu/orders/ 
Seq primer: M13F-R. 
FEATURES Location/Qualifiers 
source 1..752

/organism="Solanum tuberosum"

/mol_type="mRNA" 

/cultivar="Kennebec" 

/db_xref="taxon:4113" 

/clone="cSTD19A23" 

/tissue_type="dormant tuber" 

/dev_stage="one month post-harvest" 

/lab_host="SOLR" 

/clone_lib="cSTD" 

/note="Vector: pBluescript SK(-); Site_1: EcoRI; Site_2:
XhoI; This library targets genes expressed in dormant
tubers. This library was made from sections of dormant
tuber, avoiding the buds and epidermis. Tubers were stored
for one month post-harvest at 4oC. The tuber was peeled,
well away from the surface. Then it was chopped into 1-2
mm cubes and immediately frozen in liquid nitrogen. This
library is noted as P4 in Tanksley lab notebooks."

BASE COUNT 226 a 144 c 172 g 210 t

ORIGIN 

1 gcgaatgtga ttcttcaaag gcaacaaagg ctgacggagg aatttgtgcc tcgtgcagat 
61 ctgcttctgt ttctcatgtc tgctgatcga ccattaactg aaagtgaggt tagttttctg 
121 cgttacactc agcagtggag taagaaggtc atttttgtgc tgaacaagtc tgacatatac 
181 aagaataacg gcgagttgga ggaggccatt gcatttatca aagaaaatac acggaaattg 
241 ctgaatacag aatccgtaac actgtatcca gtatctgcac ggctcgctct tgaatcaaag 
301 ctttctactt ttgatggtgc ccttagtcaa aacaatggga gttcaaataa tgattctcac 
361 tggaaaacca agagcttcta tgagcttgag aagtacttgt ctagcttttt ggattcatcc 
421 acaagtactg gaattgagag aatgaagctg aagcttgaaa ctccaattgc cattgcagaa 
481 caactacttt tagcttgtca aggacttgtg agacaagaat gtcagcaagc caaacaagac
541 ttgctgtttg ttgaggatct tgtcaacagc gtagaagagt gcacaaagaa gctggaagtt 
601 gatagcattc tgtggaagag gcaggttcta tctctgataa actctgctca agcacgtgtt 
661 gtccggcttg tagagtcaac gttacaactg tcaaatgttg atcttgtcgc tacatatgta 
721 ttcagaagag aaaactctac tcaaatgcca gc 

// 
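Each record's BASE COUNT line can be cross-checked against its ORIGIN block, which is a quick way to spot transcription damage in a listing like this one. The following is a minimal Python sketch under stated assumptions: the record is available as plain text lines, and the helper names `count_bases` and `parse_base_count` are hypothetical, introduced here for illustration only (they are not part of any GenBank tooling).

```python
import re
from collections import Counter


def count_bases(origin_lines):
    """Count residues in GenBank ORIGIN lines.

    Everything that is not a letter (leading coordinates, spaces,
    stray digits) is dropped before counting.
    """
    counts = Counter()
    for line in origin_lines:
        counts.update(re.sub(r"[^a-z]", "", line.lower()))
    return counts


def parse_base_count(line):
    """Parse a line such as 'BASE COUNT 226 a 144 c 172 g 210 t'
    into a dict mapping each label to its count."""
    body = line.lower().replace("base count", "")
    return {label: int(n) for n, label in re.findall(r"(\d+)\s*([a-z]+)", body)}


def base_count_matches(origin_lines, base_count_line):
    """Return True when the a/c/g/t totals and the 'others' total agree."""
    declared = parse_base_count(base_count_line)
    observed = count_bases(origin_lines)
    others = sum(n for base, n in observed.items() if base not in "acgt")
    return (
        all(observed.get(b, 0) == declared.get(b, 0) for b in "acgt")
        and others == declared.get("others", 0)
    )
```

For a complete record, `base_count_matches` would be called with the lines between ORIGIN and `//` and with the record's BASE COUNT line; a mismatch suggests that either the sequence or the count line was damaged in transcription.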

2: AW760673. sl53d10.y1 Gm-c10...[gi:7692570]




LOCUS AW760673 492 bp mRNA linear EST 03-DEC-2001 

DEFINITION sl53d10.y1 Gm-c1027 Glycine max cDNA clone GENOME SYSTEMS
CLONE ID:

Gm-c1027-5036 5' similar to SW:YOR6_CALSR P40983 HYPOTHETICAL

PROTEIN IN XYNA 3'REGION ;, mRNA sequence.
ACCESSION AW760673
VERSION AW760673.1 GI:7692570
KEYWORDS EST. 
SOURCE Glycine max (soybean) 
ORGANISM Glycine max 

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 

Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids 

; eurosids I; Fabales; Fabaceae; Papilionoideae; Phaseoleae; 

Glycine. 

REFERENCE 1 (bases 1 to 492) 
AUTHORS Shoemaker,R., Keim,P., Vodkin,L., Erpelding,J., Coryell,V., Khanna
,A., Bolla,B., Marra,M., Hillier,L., Kucaba,T., Martin,J., Beck,C.,
Wylie,T., Underwood,K., Steptoe,M., Theising,B., Allen,M., Bowers
,Y., Person,B., Swaller,T., Gibbons,M., Pape,D., Harvey,N., Schurk
,R., Ritter,E., Kohn,S., Shin,T., Jackson,Y., Cardenas,M., McCann
,R., Waterston,R. and Wilson,R.
TITLE Public Soybean EST Project
JOURNAL Unpublished 
COMMENT Contact: Shoemaker R/Public Soybean EST Project 

Public Soybean EST Project 

Washington University School of Medicine 

4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA 

Tel: 314 286 1800 

Fax: 314 286 1810 

Email: est@watson.wustl.edu 

This clone is available through: ResGen, Invitrogen Corp. 2130 
South Memorial Parkway Huntsville, AL 35801 For further information 
call: (800)-533-4363 or contact via email: ccu@resgen.com 
Insert Length: 2209 Std Error: 0.00 
High quality sequence stop: 411. 
FEATURES Location/Qualifiers 
source 1..492 



/organism="Glycine max"

/mol_type="mRNA"

/db_xref="taxon:3847"

/clone="GENOME SYSTEMS CLONE ID: Gm-c1027-5036"
/tissue_type="cotyledons of 3- and 7-day-old Williams
seedlings"

/lab_host="DH10B"
/clone_lib="Gm-c1027"

/note="Vector: pBluescript II SK+; Site_1: EcoRI; Site_2:
XhoI; This cDNA library was constructed from mRNA isolated
from cotyledons of 3- and 7-day-old Williams seedlings
which were propagated on paper towels with distilled
water. The cotyledons were flash-frozen in liquid
nitrogen, then lyophilized for 72 hours. Unequal amounts
of mRNA was used for cDNA synthesis. Stratagene's cDNA
Synthesis Kit (catalog number 200401) was used to
synthesize the cDNA. First-stranded synthesis was
performed with 5-methyl dCTP, hence the ligated cDNA was
hemimethylated. A modification of Stratagene's
first-strand synthesis primer was used. An anchor
nucleotide (V=A, C, or G) was added to the 3' end of the
primer [GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAG(T)18] to anchor
the primer at the 5' end of the poly(A) tract. After
second-strand synthesis, the cDNA ends were filled in
with cloned Pfu DNA, ligated to EcoRI adapters and
subsequently phosphorylated. The XhoI site within the
first-strand synthesis primer was then restricted by
digestion with XhoI; all XhoI sites in the cDNA would be
protected by their hemimethylated status. The cDNA
constructs were size-fractionated with a 500 bp cutoff,
using GibcoBRL Life Technologies' cDNA Size Fractionation
column. The column eluent was then ligated into
Stratagene's pBluescript(tm) II XR Predigested vector
(pBluescript II SK(+) that has been digested with EcoRI
and XhoI, and phosphorylated by Stratagene). 97% of the
white and blue colonies appear to contain recombinant 
plasmids with cDNA inserts, based on size (n=30). This 
library was constructed by Dr. Paul Keim and Dr. Virginia 
Coryell." 

BASE COUNT 135 a 91 c 108 g 158 t
ORIGIN 

1 tgttgaatga agctattgaa gctatcaaga gggctgcacc tctgatggag gaggtttcac 
61 ttcttaatga tgcggtttct caaattgatg agccattctt actggttata gtgggggaat 
121 tcaactctgg taaatctacc gtgattaatg cgcttcttgg agaaagatat ctcaaagagg 
181 gagttgttcc aacaactaat gagatcacat ttttacgata tactgactta gatattgaac 
241 aacaacggtg tgaaaggcat ccagatggcc aatatatttg ctacattcct gctccaattc 
301 ttaaagagat gaccattgtt gatacacctg gaactaatgt gattcttcag aggcagcagc 
361 gtcttacaga ggaatttgta ccccgtgcag atttacttct ttttgtcatt tctgctgatc 
421 gccctttaac tggaagtgag attgcttttc ttcgttattc tcagcagtgg aaaaagaaag 
481 cggtctttgt ct

//

3: BE353824. EST355167 tomato ...[gi:9291800]




LOCUS BE353824 446 bp mRNA linear EST 18-MAY-2001 

DEFINITION EST355167 tomato flower buds, anthesis, Cornell University
Lycopersicon esculentum cDNA clone cTOD6M4, mRNA sequence.
ACCESSION BE353824
VERSION BE353824.1 GI:9291800
KEYWORDS EST. 

SOURCE Lycopersicon esculentum (tomato) 
ORGANISM Lycopersicon esculentum 

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; 
asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon. 
REFERENCE 1 (bases 1 to 446) 
AUTHORS van der Hoeven,R.S., Bezzeredes,J.L., Matern,A.L., Holt,J.E., Liang
,F., Hansen,T.S., Craven,M.B., Bowman,C.L., Ronning,C.M., Nierman
,W., Fraser,C.M., Martin,G.B., Giovannoni,J.J. and Tanksley,S.D.
TITLE Generation of ESTs from tomato flower tissue, anthesis 
JOURNAL Unpublished 
COMMENT Contact: CUGI 

Clemson University Genomics Institute 
Clemson University 
100 Jordan Hall, Clemson, SC 29634, USA
Email: http://www.genome.clemson.edu/orders/index.html 
5 prime sequence. 
FEATURES Location/Qualifiers 
source 1..446 

/organism="Lycopersicon esculentum"

/mol_type="mRNA"

/cultivar="TA496"

/db_xref="taxon:4081"

/clone="cTOD6M4"

/tissue_type="flower"

/dev_stage="anthesis"

/clone_lib="tomato flower buds, anthesis, Cornell
University"

/note="Vector: pBlueScript SK(-); Site_1: EcoRI; Site_2:
XhoI; supplier: Tanksley; Flower buds and flowers were
taken from greenhouse plants (4-8 wks old, TA496). They 
were immediately frozen in liquid nitrogen and then 
size-separated while remaining frozen." 

BASE COUNT 119 a 82 c 116 g 129 t

ORIGIN 

1 gagaccatta agtacaattc tataagcagt cttttgaaaa aagatggact tcattggtga 
61 atccgtctga ccaaattgag ttaggaacaa ctggtgtgct ggatagaaaa tctgaagtta 
121 ccataagtgt catagaggat ttcagtgctg cagctgcttc aaaattgctt gagagagata 
181 ttcgtgaagt gttcttgggt acttttggtg gtcttggagc agctggttta tcagcgtcgc 
241 ttctgacatc tgttcttcaa accacattag aagacctcct tgcacttggc ctttgttctg 
301 ctggcgggtt attagcggtc ttcaacttct catcccggag acagcaagtg gtagataaag 
361 taaagaggac tgctgatggc ctttcacgtg aactcgaaga ggctatgcag aaggagctct 
421 tggagacgac tagtaatgtg gaggac 

// 

4: BI136291. F066P17Y Populus ...[gi:18017219]




LOCUS BI136291 521 bp mRNA linear EST 31-DEC-2001 

DEFINITION F066P17Y Populus flower cDNA library Populus balsamifera subsp. 

trichocarpa cDNA, mRNA sequence. 
ACCESSION BI136291
VERSION BI136291.1 GI:18017219
KEYWORDS EST. 

SOURCE Populus balsamifera subsp. trichocarpa 
ORGANISM Populus balsamifera subsp. trichocarpa 

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; rosids 
; eurosids I; Malpighiales; Salicaceae; Populus. 



Fig. 27, continued 6/6 



REFERENCE 1 (bases 1 to 521) 
AUTHORS Hertzberg,M., Aspeborg,H., Erlandsson,R., Bjorkbacka,H., Hiltonen
,T., Karlsson,J., Teeri,T., Gustafsson,P., Bahlerao,R., Jansson,S.,
Nilsson,O., Sundberg,B., Nilsson,P., Uhlen,M., Sandberg,G. and
Lundeberg,J.
TITLE Gene expression in Populus 
JOURNAL Unpublished 
COMMENT Contact: Erlandsson R 
Department of Biotechnology 
Royal Institute of Technology 
Teknikringen 30, Stockholm S-10044, Sweden
Tel: 46 8 790 8287 
Fax: 46 8 245452 
Email: rikerl@biochem.kth.se. 
FEATURES Location/Qualifiers 
source 1..521 

/organism="Populus balsamifera subsp. trichocarpa"
/mol_type="mRNA"
/sub_species="trichocarpa"
/db_xref="taxon:3694"
/clone_lib="Populus flower cDNA library"
/note="Organ: flower"
BASE COUNT 143 a 87 c 135 g 156 t
ORIGIN 

1 tggtgttgtg ctgtctgatc aagggcttcc tgcccttgtg gcaagaaata tgatgatggg 
61 ttctcgaact gaatcagttg ttctaccttt ggtagccagg attgtgcaga caccatatgc 
121 tgcattaaat gcgtctaatt ctgaaggtgc tgattttctt atatatgttc atggcccaga 
181 ggatgatcct gatgtagaaa tgagccctgg attcgggaat gtgaagatac caatctttgt 
241 cctcaatgct tcacgtgggg aggacacatt gtcggtgggg gcatcaaaat ttctgaaaac 
301 cggtgctagt ggtttagttc tgtcattgga agatttgagg ttatttagcg atgatgcttt 
361 gagtcagatg tttgacactc tgagtgcaac cggtaaaaac tttcaggatg accttgaaag 
421 cttcagtaag ctcaaatcta tggatatgga aaatgatatt catgaaaaaa caacggtggc 
481 aggctttgtt aaactggagg atagagaaaa acagctcata g 



